SUCROSE-BINDING PROTEINS

FIELD OF THE INVENTION 

This invention relates to carbohydrate metabolism in plants, and in 
particular to sucrose-binding proteins (SBPs). Aspects of the invention include a
novel SBP gene isolated from soybean, and modified SBPs having enhanced 
sucrose uptake activity. Nucleic acid vectors, transgenic cells and transgenic plants 
having modified sucrose uptake activity are also provided. The invention also 
relates to promoter sequences useful for controlling expression of transgenes in 
plants, including SBP transgenes.

BACKGROUND OF THE INVENTION 

The regulation of sucrose transport in plants has a major impact on plant 
growth and productivity. Through photosynthesis, plants fix atmospheric carbon
dioxide into triose phosphates, which are then used to produce sucrose and other
carbohydrates. These carbohydrates are then transported throughout the plant for 
use as energy sources, carbon skeletons for biosynthesis and storage for future 
growth needs. Sucrose is the major form of transported carbohydrate. The ability 
of plant cells to transport sucrose actively across the plasma membrane, so that the
sucrose mobilized in the phloem can be taken into cells for use, is a critical
step in sucrose utilization. 

The development of plant seeds involves the accumulation of carbon and 
nitrogen reserves in forms that can both withstand desiccation and be utilized as an 
energy source by the developing embryo during germination. The accumulation of
carbon in developing seeds is mediated by specific plasma membrane proteins
(Overvoorde et al., 1996; Riesmeier et al., 1992; Bush, 1993). Photoaffinity 
labeling of membranes isolated from soybean cotyledon tissue with a photolyzable 
sucrose analog identified a distinct 62 kD sucrose-binding protein, or SBP (Ripp et 
al., 1988). Analysis of the cDNA encoding the SBP and its deduced amino acid
sequence indicates that the SBP contains a single hydrophobic domain at its
N-terminus but otherwise is a hydrophilic protein lacking the
membrane-spanning hydrophobic segments typically present in transport proteins (Grimes et
al., 1992). Biochemical analysis of the topology of the SBP demonstrates that it is
tightly associated with the external leaflet of the plasma membrane (Overvoorde &
Grimes, 1994). The involvement of the SBP in sucrose uptake was implicated by 
immunolocalization experiments demonstrating that the SBP is exclusively
associated with the plasma membrane of cells involved in active sucrose uptake
(Grimes et al., 1992). Kinetic analysis of SBP-mediated sucrose uptake in a yeast
system indicates that the uptake is specific for sucrose but is proton independent and
relatively nonsaturable, thus defining a novel mechanism for sucrose uptake
(Overvoorde et al., 1996).

Sucrose uptake in developing seeds affects two significant agricultural
characteristics of the mature seed: the carbohydrate content of the resulting seed
grain, and the vitality of the seedling that emerges when the seed grain is planted.
Enhanced sucrose uptake activity in developing seeds may be desirable where it is
an advantage to increase the carbohydrate content of the seed (e.g., where the seed is
the primary plant material harvested, such as soybean). In contrast, decreased
sucrose uptake activity in seeds might be desirable where the vegetative material of
the plant is harvested. Thus, plants having modified sucrose uptake activity during
seed development would be of significant agricultural importance, and it is to such 
plants that the present invention is directed. 

SUMMARY OF THE INVENTION 

The present invention provides isolated nucleic acid molecules encoding 
plant sucrose binding proteins, which are key proteins in the uptake of sucrose into 
developing seeds. In one embodiment, the invention provides modified forms of 
sucrose binding proteins that are shown to have enhanced sucrose uptake activity.

The previously described sucrose binding protein from soybean (Overvoorde
et al., 1996) is herein referred to as SBP1. A new SBP is provided herein and is 
referred to as SBP2. The SBP2 polypeptide is shown to be 489 amino acid residues 
in length, and to be expressed at enhanced levels during seed development. The 
SBP2 polypeptide is shown to have sucrose uptake activity in a heterologous yeast
assay system.

In addition, modified forms of the SBP1 and SBP2 proteins are provided
having enhanced sucrose uptake activity. In one embodiment, such forms are
deletion mutants in which amino acid residues are removed from the C-terminus of
the proteins. By way of example, removal of 80 amino acid residues from the
C-terminus of the SBP1 protein is shown to produce increased sucrose uptake in the
yeast assay system. 

The invention also provides 5' regulatory regions (including promoter 
sequences) of the soybean SBP1 and SBP2 genes. These regulatory regions confer 
specific or enhanced expression in developing seeds and so may be used to express
any transgene in developing seeds.

Thus, in one aspect, the invention provides a modified plant sucrose binding 
protein wherein the modified sucrose binding protein has a modified amino acid 
sequence compared to a corresponding wild-type sucrose binding protein, and 
wherein expression of the modified sucrose binding protein in a yeast assay system
confers enhanced sucrose uptake compared to the corresponding wild-type sucrose
binding protein. In particular embodiments, modified sucrose binding proteins 
provided by the invention enhance sucrose uptake in the yeast assay system by at 
least 10%, and preferably by at least 25%, compared to the wild-type sucrose 
binding protein. In certain embodiments, the modified plant sucrose binding
proteins have a modified amino acid sequence comprising a C-terminal truncation
compared to the wild-type sucrose binding protein. Such a truncation is typically of
between about 10 and about 100 amino acids, and is preferably of about 80 amino
acids. Although such modified SBPs may be produced from any known sucrose 
binding proteins, modified forms of SBP1 and SBP2 are exemplary of the invention.
Modified forms of SBP1 and SBP2 include those forms having the amino acid
sequences shown in Seq. I.D. Nos. 2 and 4, respectively.
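
The 10% and 25% figures recited above are relative increases over the uptake conferred by the wild-type protein in the same assay. The arithmetic can be sketched as follows; the function names and the uptake values are hypothetical, chosen only for illustration, and are not data from this disclosure.

```python
def percent_enhancement(modified_uptake, wild_type_uptake):
    """Relative increase of modified over wild-type uptake, in percent."""
    if wild_type_uptake <= 0:
        raise ValueError("wild-type uptake must be positive")
    return 100.0 * (modified_uptake - wild_type_uptake) / wild_type_uptake


def meets_threshold(modified_uptake, wild_type_uptake, threshold_percent=10.0):
    """True if the enhancement is at least the given threshold (default 10%)."""
    return percent_enhancement(modified_uptake, wild_type_uptake) >= threshold_percent


# Hypothetical uptake rates in arbitrary units (illustrative values only).
wild_type = 4.0
modified = 5.0
print(percent_enhancement(modified, wild_type))    # 25.0
print(meets_threshold(modified, wild_type, 25.0))  # True
```

Both rates would need to be measured under the same assay conditions for the comparison to be meaningful.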

In another aspect of the invention, nucleic acid molecules encoding modified 
plant sucrose binding proteins are provided, together with vectors comprising such 
nucleic acid molecules. The invention also provides transgenic plants expressing
modified sucrose binding proteins. Such transgenic plants may have modified
sucrose uptake activity, particularly in developing seeds.

In another aspect, the invention provides an isolated nucleic acid molecule
encoding an SBP2 sucrose binding protein or a variant of an SBP2 protein. Such
proteins may comprise an amino acid sequence as shown in Seq. I.D. Nos. 3 and 4,
or sequences having at least 70% and preferably at least 90% sequence identity with
these sequences. Recombinant expression cassettes comprising such nucleic acid
molecules are also provided by the invention, as are transgenic plants comprising 
such recombinant expression cassettes. 

Another aspect of the invention is a recombinant nucleic acid molecule 
comprising a promoter sequence operably linked to a nucleic acid sequence, wherein
the promoter sequence comprises an SBP1 or SBP2 promoter. Such promoters
preferably comprise at least 25 consecutive nucleotides of the 5' regulatory
sequences shown in Seq. I.D. Nos. 6 and 7. In particular embodiments, the nucleic
acid sequence encodes a plant sucrose binding protein. Transgenic plants
comprising such recombinant nucleic acid molecules are also an aspect of the
invention.

These and other aspects of the invention are discussed in more detail in the 
following description. 



BRIEF DESCRIPTION OF THE DRAWINGS 

Fig. 1 shows an alignment of the SBP1 and SBP2 protein sequences.

Fig. 2 is a graph showing sucrose uptake activity in the yeast assay system. 

SEQUENCE LISTING 

The nucleic and amino acid sequences listed in the sequence listing are 
shown using standard single-letter abbreviations for nucleotide bases, and
three-letter code for amino acids. Only one strand of each nucleic acid sequence is
shown, but the complementary strand is understood to be included by any reference 
to the displayed strand. 

Seq. I.D. No. 1 shows the amino acid sequence of the SBP1 protein.

Seq. I.D. No. 2 shows the amino acid sequence of the truncated SBP1
protein from which the C-terminal 80 amino acids are deleted.

Seq. I.D. No. 3 shows the amino acid sequence of the SBP2 protein.

Seq. I.D. No. 4 shows the amino acid sequence of the truncated SBP2
protein from which the C-terminal 80 amino acids are deleted.

Seq. I.D. No. 5 shows the SBP2 cDNA sequence.

Seq. I.D. No. 6 shows the SBP2 gene 5' regulatory region.

Seq. I.D. No. 7 shows the SBP1 gene 5' regulatory region.

Seq. I.D. Nos. 8-14 show oligonucleotides that may be used to amplify 
various regions of the SBP2 cDNA or 5' regulatory region. 



DETAILED DESCRIPTION OF THE INVENTION 

I. Methods 

Standard molecular biology methods may be used to practice the present 
invention. Such methods are described in many publications, including Sambrook 
et al. (1989), Ausubel et al. (1994), Innis et al. (1990), Weissbach & Weissbach
(1989), and Tijssen (1993).



II. Definitions 

Unless otherwise noted, technical terms are used according to conventional 
usage. Definitions of common terms in molecular biology may be found in
Benjamin Lewin, Genes V, published by Oxford University Press, 1994 (ISBN
0-19-854287-9); Kendrew et al. (eds.), The Encyclopedia of Molecular Biology, published
by Blackwell Science Ltd., 1994 (ISBN 0-632-02182-9); and Robert A. Meyers
(ed.), Molecular Biology and Biotechnology: a Comprehensive Desk Reference,
published by VCH Publishers, Inc., 1995 (ISBN 1-56081-569-8). The nomenclature
for DNA bases as set forth at 37 CFR § 1.822 and the standard three letter codes for
amino acid residues are used herein.

In order to facilitate review of the various embodiments of the invention, the 
following definitions of terms are provided:

Sucrose binding protein (SBP): SBPs are involved in sucrose uptake in
plants. This activity can be conveniently determined and measured using the yeast
sucrose uptake assay originally described by Overvoorde et al. (1996), which is also
described in detail below; in this assay system, SBPs confer sucrose uptake ability
on yeast cells that are otherwise unable to take up sucrose. Use of the term SBP
refers generally to any sucrose binding protein, including the sucrose binding 
protein previously described by Grimes et al. (1992). This invention provides a 
cDNA encoding a previously unreported sucrose binding protein, the SBP2 protein 
from soybean (Glycine max). However, the invention is not limited to this particular
SBP: other nucleotide sequences which encode SBP enzymes are also part of the 
invention, including variants on the disclosed Glycine gene sequences and 
orthologous sequences from other plant species, the cloning of which is now 
enabled. Such sequences share the essential functional characteristic of encoding an
enzyme that is capable of mediating sucrose uptake in the described yeast assay
system. Nucleic acid sequences that encode SBPs and the proteins encoded by such
nucleic acids share not only this functional characteristic, but also a specified level
of sequence similarity (or sequence identity), as addressed below. The concept of
sequence identity can also be expressed in the ability of two sequences to hybridize
to each other under stringent conditions.

The present invention also provides modified SBPs having altered functional 
characteristics, as well as nucleic acid sequences encoding such proteins. An SBP 
isolated from an untransformed (wild-type) plant may be referred to as having a 
wild-type amino acid sequence. Modified SBPs have amino acid sequences that
differ from the wild-type amino acid sequence. Such differences may take the form
of amino acid deletions, additions, substitutions or truncations. A protein having
amino acid deletions lacks one or more of the amino acid residues present in the
wild-type sequence; such residues may be deleted from any portion of the protein.
In contrast, a truncated protein is one in which one or more amino acids are deleted
from the N and/or C terminus of the protein. Thus, truncated proteins are a
subclass of proteins having amino acid deletions.

Nucleic acid sequences encoding modified SBPs can readily be produced 
using standard methodologies, such as site directed mutagenesis and polymerase 
chain reaction amplification. 

Sequence identity: The similarity between two nucleic acid sequences, or two
amino acid sequences, is expressed in terms of the similarity between the sequences,
otherwise referred to as sequence identity. Sequence identity is frequently measured
in terms of percentage identity (or similarity or homology); the higher the
percentage, the more similar the two sequences are.

The calculation of percentage of sequence identity for amino acid sequences 
may take into account conservative amino acid substitutions. Conservative amino 
acid substitutions involve the replacement of one amino acid residue with another
residue having similar chemical and biological properties (e.g., charge or 
hydrophobicity). Such substitutions typically do not change the functional 
properties of the protein, and should therefore be accounted for in the calculation of 
sequence identity by assigning a value that is in between values assigned for identity
(i.e., no change at that amino acid position) and non-conservative residue changes.
Thus, conservative amino acid changes are scored as a partial rather than a full 
mismatch, thereby increasing the percentage sequence identity. For example, if an 
identical amino acid is given a score of one and a non-conservative substitution is 
given a score of zero, a conservative substitution might be given a score of 0.5. The
scoring of conservative substitutions may be calculated, e.g., according to the algorithm
of Meyers and Miller (1988), as implemented in the program PC/GENE
(Intelligenetics, Mountain View, California, USA).
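
The 1 / 0.5 / 0 scoring example given above can be sketched as follows. This is a simplified illustration only and is not the algorithm of Meyers and Miller (1988): it assumes two pre-aligned, gap-free sequences of equal length written in single-letter amino acid code, and the grouping of residues treated as conservative is a hypothetical choice made for this example.

```python
# Hypothetical groups of residues with similar properties (illustrative only).
CONSERVATIVE_GROUPS = [
    set("ILVM"),  # aliphatic / hydrophobic
    set("FYW"),   # aromatic
    set("KRH"),   # basic
    set("DE"),    # acidic
    set("ST"),    # hydroxyl
    set("NQ"),    # amide
]


def position_score(a, b):
    """1.0 for identity, 0.5 for a conservative substitution, 0.0 otherwise."""
    if a == b:
        return 1.0
    if any(a in group and b in group for group in CONSERVATIVE_GROUPS):
        return 0.5
    return 0.0


def percent_identity(seq1, seq2):
    """Percentage identity with partial credit for conservative substitutions."""
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("sequences must be non-empty and of equal length")
    total = sum(position_score(a, b) for a, b in zip(seq1, seq2))
    return 100.0 * total / len(seq1)


print(percent_identity("MKDL", "MKDL"))  # 100.0 (identical)
print(percent_identity("MKDL", "MRDL"))  # 87.5  (K to R scored as 0.5)
print(percent_identity("MKDL", "MGDL"))  # 75.0  (K to G scored as 0)
```

Scoring the K to R change as a partial mismatch raises the result from 75% to 87.5%, which is the effect described in the preceding paragraph.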

Methods of alignment of sequences for comparison are well known in the art. 
Various programs and alignment algorithms are described in: Smith and Waterman
(1981); Needleman and Wunsch (1970); Pearson and Lipman (1988); Higgins and
Sharp (1988); Higgins and Sharp (1989); Corpet et al. (1988); Huang et al. (1992); 
and Pearson et al. (1994). Altschul et al. (1994) presents a detailed consideration of 
sequence alignment methods and homology calculations. 

The NCBI Basic Local Alignment Search Tool (BLAST) (Altschul et al.,
1990) is available from several sources, including the National Center for Biotechnology
Information (NCBI, Bethesda, MD) and on the Internet, for use in connection with
the sequence analysis programs blastp, blastn, blastx, tblastn and tblastx. It can be
accessed at <http://www.ncbi.nlm.nih.gov/BLAST/>. A description of how to
determine sequence identity using this program is available at
<http://www.ncbi.nlm.nih.gov/BLAST/blast_help.html>.

Homologs of the disclosed SBP2 protein are characterized by possession of
at least 80% sequence identity counted over the full length alignment with the
amino acid sequence of the soybean SBP2 protein using the
NCBI Blast 2.0, gapped blastp set to default parameters. Such homologous peptides
will more preferably possess at least 85%, more preferably at least 90% and still
more preferably at least 95% sequence identity determined by this method. When
less than the entire sequence is being compared for sequence identity, homologs will
possess at least 90% and more preferably at least 95% and more preferably still at
least 98% sequence identity over short windows of 10-20 amino acids. Methods for
determining sequence identity over such short windows are described at
<http://www.ncbi.nlm.nih.gov/BLAST/blast_FAQs.html>. Homologs having the
sequence identities described above will also possess the ability to mediate sucrose
uptake in the described yeast assay system. The present invention provides not only
the peptide homologs described above, but also nucleic acid molecules that
encode such homologs.
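
Identity over short windows, as referred to above, can be illustrated with a simple sliding-window calculation. The sketch below is not the BLAST procedure: it assumes two pre-aligned, gap-free sequences of equal length and counts exact matches only, and the two peptides shown are made-up strings, not SBP sequences.

```python
def window_identities(seq1, seq2, window):
    """Percent identity of each consecutive window of the given length."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be of equal length")
    if window <= 0 or window > len(seq1):
        raise ValueError("window must be between 1 and the sequence length")
    matches = [a == b for a, b in zip(seq1, seq2)]
    return [
        100.0 * sum(matches[start:start + window]) / window
        for start in range(len(seq1) - window + 1)
    ]


# Made-up 12-residue peptides differing only at the last position.
reference = "MKTAYIAKQRQI"
candidate = "MKTAYIAKQRQL"
print(window_identities(reference, candidate, 10))  # [100.0, 100.0, 90.0]
```

A candidate could then be compared against the 90%, 95% or 98% window thresholds recited above, for example by taking the minimum over all windows.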

Homologs of the soybean SBP2 gene are similarly characterized by 
possession of at least 70% sequence identity counted over the full length alignment 
with the disclosed Glycine SBP2 gene sequence using the NCBI Blast 2.0, gapped 
blastn set to default parameters. Such homologous nucleic acids will more
preferably possess at least 75%, more preferably at least 80% and still more
preferably at least 90% or 95% sequence identity determined by this method. When 
less than the entire sequence is being compared for sequence identity, homologs will 
possess at least 85% and more preferably at least 90% and more preferably still at 
least 95% sequence identity over 30 nucleotide windows. Homologs having the 
sequence identities described above will, in some embodiments, also encode a 
polypeptide having ability to mediate sucrose uptake in the described yeast assay 
system. However, homologs as defined above are useful for modifying sucrose 
uptake activity in transgenic plants (for example, as used in antisense constructs) 
even when they do not encode a functional peptide. 

Another indication that two nucleic acid molecules are substantially 
homologous is that the two molecules hybridize to each other under stringent 
conditions when one molecule is used as a hybridization probe, and the other is 
present in a biological sample, e.g., genomic material from a cell. Specific 
hybridization means that the molecules hybridize substantially only to each other
and not to other molecules that may be present in the genomic material. Stringent
conditions are sequence dependent and are different under different environmental
parameters. Generally, stringent conditions are selected to be about 5°C to 20°C
lower than the thermal melting point (Tm) for the specific sequence at a defined
ionic strength and pH. The Tm is the temperature (under defined ionic strength and
pH) at which 50% of the target sequence hybridizes to a perfectly matched probe.
Conditions for nucleic acid hybridization and calculation of stringencies can be 
found in Sambrook et al. (1989) and Tijssen (1993). Hybridization conditions and 
stringencies are further discussed below. 

Nucleic acid sequences that do not show a high degree of identity may
nevertheless encode similar amino acid sequences, due to the degeneracy of the
genetic code. It is understood that changes in nucleic acid sequence can be made
using this degeneracy to produce multiple nucleic acid sequences that all encode
substantially the same protein.
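
By way of illustration only, the following sketch shows two nucleotide sequences that differ at several positions yet encode the same four residues. The first sequence is the first twelve nucleotides of primer 3 (Seq. I.D. No. 10) as given in this document; the second is a hypothetical alternative constructed for this example. Only the handful of standard-genetic-code codons needed here are included in the table.

```python
# Partial standard genetic code (only the codons used in this example).
CODON_TABLE = {
    "ATG": "Met",
    "GCG": "Ala", "GCT": "Ala",
    "ACC": "Thr", "ACA": "Thr",
    "AGA": "Arg", "CGT": "Arg",
}


def translate(dna):
    """Translate a DNA coding sequence into three-letter amino acid codes."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return [CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3)]


def nucleotide_identity(seq1, seq2):
    """Percent of positions at which two equal-length sequences match."""
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq1, seq2))
    return 100.0 * matches / len(seq1)


original = "ATGGCGACCAGA"     # first 12 nt of primer 3 (Seq. I.D. No. 10)
alternative = "ATGGCTACACGT"  # hypothetical synonymous variant

print(translate(original))     # ['Met', 'Ala', 'Thr', 'Arg']
print(translate(alternative))  # ['Met', 'Ala', 'Thr', 'Arg']
print(round(nucleotide_identity(original, alternative), 1))  # 66.7
```

The two sequences match at only 8 of 12 nucleotide positions but are identical at the amino acid level.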

Probes and primers: Nucleic acid probes and primers may readily be
prepared based on the nucleic acids provided by this invention. A probe comprises
an isolated nucleic acid attached to a detectable label or reporter molecule. Typical
labels include radioactive isotopes, ligands, chemiluminescent agents, and enzymes.
Methods for labeling and guidance in the choice of labels appropriate for various
purposes are discussed, e.g., in Sambrook et al. (1989) and Ausubel et al. (1987).

Primers are short nucleic acids, preferably DNA oligonucleotides 15
nucleotides or more in length. Primers may be annealed to a complementary target
DNA strand by nucleic acid hybridization to form a hybrid between the primer and
the target DNA strand, and then extended along the target DNA strand by a DNA
polymerase enzyme. Primer pairs can be used for amplification of a nucleic acid
sequence, e.g., by the polymerase chain reaction (PCR), or other nucleic-acid
amplification methods known in the art.

Methods for preparing and using probes and primers are described, for 
example, in Sambrook et al. (1989), Ausubel et al. (1987), and Innis et al. (1990).
PCR primer pairs can be derived from a known sequence, for example, by using
computer programs intended for that purpose such as Primer (Version 0.5, © 199
Whitehead Institute for Biomedical Research, Cambridge, MA). One of skill in the
art will appreciate that the specificity of a particular probe or primer increases with
its length. Thus, for example, a primer comprising 20 consecutive nucleotides of 
the SBP1 or SBP2 gene 5' regulatory region will anneal to a target sequence (e.g., a 
corresponding SBP regulatory region from Faba bean) with a higher specificity than
a corresponding primer of only 15 nucleotides. Thus, in order to obtain greater
specificity, probes and primers may be selected that comprise 20, 25, 30, 35, 40, 50
or more consecutive nucleotides of the nucleic acid sequences disclosed herein. 
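
The selection of probes or primers comprising a given number of consecutive nucleotides can be sketched as a simple enumeration of windows along a sequence. The example below is illustrative only: the function names are hypothetical, and the input is the 27-nucleotide sequence given later in this document as primer 3 (Seq. I.D. No. 10), used here merely as a short stand-in for a longer disclosed sequence.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def candidate_oligos(seq, length):
    """All stretches of `length` consecutive nucleotides found in `seq`."""
    if length <= 0 or length > len(seq):
        raise ValueError("length must be between 1 and the sequence length")
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


# Stand-in sequence: primer 3 (Seq. I.D. No. 10), 27 nucleotides.
sequence = "ATGGCGACCAGAGCCAAGCTTTCTTTA"

twenty_mers = candidate_oligos(sequence, 20)
print(len(twenty_mers))                    # 8 candidate 20-mers
print(twenty_mers[0])                      # ATGGCGACCAGAGCCAAGCT
print(reverse_complement(twenty_mers[0]))  # antisense-strand version
```

Practical primer design would also weigh factors such as melting temperature and self-complementarity, which this enumeration does not address.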

Transformed: A transformed cell is a cell into which has been introduced a 
nucleic acid molecule by molecular biology techniques. As used herein, the term
transformation encompasses all techniques by which a nucleic acid molecule might
be introduced into such a cell, including Agrobacterium transformation, plasmid
transformation, viral transfection and introduction of naked DNA by
electroporation, lipofection, and particle gun acceleration.

Vector: A nucleic acid molecule as introduced into a host cell, thereby
producing a transformed host cell. A vector may include nucleic acid sequences that
permit it to replicate in the host cell, such as an origin of replication. A vector may 
also include one or more selectable marker genes and other genetic elements known 
in the art. 

Isolated: An "isolated" biological component (such as a nucleic acid or 
protein) has been substantially separated or purified away from other biological
components in the cell of the organism in which the component naturally occurs,
i.e., other chromosomal and extrachromosomal DNA and RNA, and proteins.
Nucleic acids and proteins which have been "isolated" thus include nucleic acids
and proteins purified by standard purification methods. The term also embraces
nucleic acids and proteins prepared by recombinant expression in a host cell as well
as chemically synthesized nucleic acids. 

Purified: The term purified does not require absolute purity; rather, it is 
intended as a relative term. Thus, for example, a purified SBP preparation is one in 
which the SBP is more enriched than the protein is in its natural environment within 
a cell. Preferably, a preparation of SBP is purified such that the SBP represents at
least 50% of the total protein content of the preparation.

Operably linked: A first nucleic acid sequence is operably linked with a
second nucleic acid sequence when the first nucleic acid sequence is placed in a
functional relationship with the second nucleic acid sequence. For instance, a
promoter is operably linked to a coding sequence if the promoter affects the
transcription or expression of the coding sequence. Generally, operably linked
DNA sequences are contiguous and, where necessary to join two protein coding 
regions, in the same reading frame. 

Recombinant: A recombinant nucleic acid is one that has a sequence that is 
not naturally occurring or has a sequence that is made by an artificial combination 
of two otherwise separated segments of sequence. This artificial combination is
often accomplished by chemical synthesis or, more commonly, by the artificial 
manipulation of isolated segments of nucleic acids, e.g., by genetic engineering 
techniques. 

Ortholog: Two nucleotide or amino acid sequences are orthologs of each 
other if they share a common ancestral sequence and diverged when a species
carrying that ancestral sequence split into two species. Orthologous sequences are
also homologous sequences.

Transgenic plant: as used herein, this term refers to a plant that contains 
recombinant genetic material not normally found in plants of this type and which 
has been introduced into the plant in question (or into progenitors of the plant) by
human manipulation. Thus, a plant that is grown from a plant cell into which
recombinant DNA is introduced by transformation is a transgenic plant, as are all
offspring of that plant which contain the introduced DNA (whether produced
sexually or asexually). Transgenic plants may be produced from any transformable
plant species, both monocotyledonous and dicotyledonous plants, including but not
limited to soybean, rice, wheat, barley, and maize. 

III. The SBP2 cDNA and encoded SBP2 peptide 

The nucleic acid sequence of the SBP2 cDNA is shown in Seq. I.D. No. 5, and 
the amino acid sequence of the SBP2 protein is shown in Seq. I.D. No. 3. A
comparison of the amino acid sequences of SBP1 and SBP2 is shown in Fig. 1.

i. Differential expression of SBP1 and SBP2 genes in soybean leaves and cotyledons

The sense and antisense RNAs of the 32P-labeled SBP1 and SBP2 5'-flanking
regions were synthesized in vitro, and 5.3 x 10^5 cpm of an SBP1 sense, SBP1 antisense,
SBP2 sense or SBP2 antisense RNA probe was hybridized with 5 µg poly(A+)
mRNA from soybean leaves and cotyledons. SBP1 and SBP2 transcripts were
observed to accumulate to similar levels in soybean cotyledons. In contrast, no
SBP1 or SBP2 transcripts were detected in 4-wk old soybean leaves.



ii. Differential expression of soybean SBP1 and SBP2 genes

The expression patterns of the SBP1 and SBP2 genes were examined in
soybean seeds using RNase protection methods. Five stages of seed cotyledon
development were used (Stage 1 = 4 mm or less, Stage 2 = 5-6 mm, Stage 3 = 7 mm,
Stage 4 = 9 mm, Stage 5 = 11-12 mm). During cotyledon development, an SBP1
antisense probe protected three major fragments (119, 111, and 97 nucleotides),
indicating that three different transcription start sites were used. The SBP1 mRNA
level reaches a plateau at stage 3, and this expression level is maintained until stage
5. In contrast, 5 protected fragments were detected when using the SBP2 antisense
probe, and the SBP2 mRNA level continuously increased until seed size reached 11-12
mm. Quantitative data indicated that SBP1 mRNA is three times more
abundant than SBP2 mRNA. The mRNA level in leaf tips is very low. However, low
levels of SBP1 mRNA can be observed in 3 mm leaf tips after prolonged exposure.
These data indicate that both SBP1 and SBP2 mRNAs are actively and differentially
transcribed during seed development.

IV. 5' regulatory regions of SBP1 and SBP2



Given the tissue-specific expression of the SBPl and SBP2 genes, the 
regulatory regions of these genes responsible for conferring such expression are of 
interest, and may be used to regulate transgene expression in a similarly
tissue-specific manner.

The 5' regulatory regions of SBP1 and SBP2 are shown in Seq. I.D. Nos. 7
and 6, respectively.

V. Modified SBPs having enhanced sucrose uptake activity 
The yeast assay system described by Overvoorde et al. (1996) was used to
determine the effect of modifying the amino acid sequence of the SBP proteins. This
assay uses a derivative of the yeast strain susy7 (Riesmeier et al., 1992) which has a
spinach sucrose synthase cDNA stably integrated into its genome to mediate the
intracellular hydrolysis of sucrose. However, this yeast strain lacks the ability to
transport sucrose and so is unable to grow on a medium containing sucrose as the
sole carbon source (Riesmeier et al., 1992). To generate a host strain that permits
selection for yeast transformed with a sucrose binding protein gene, the susy7 strain
was selected for uracil auxotrophy by growth on medium containing 5-fluoroorotic
acid (Overvoorde et al., 1996). The resulting strain, susy7/ura3, is unable to grow on
a medium lacking uracil and containing glucose as the sole carbon source.

Chimeric genes consisting of the yeast alcohol dehydrogenase 1 (ADH1)
promoter, an SBP open reading frame and the ADH1 polyadenylation signal were
constructed in the yeast vector pMK195 as described by Overvoorde et al. (1996) to
create plasmids designated pYESBP. The susy7/ura3 yeast strain was transformed
with these constructs using a small-scale LiOAc-based procedure essentially as
described by Gietz et al. (1992). Transformed yeast were then plated on the uracil
dropout selection medium containing 2% glucose (CM[GLU]) or 2% sucrose
(CM[SUC]) (Ausubel et al., 1994).

Uptake assays were performed by growing the transformed yeast cells to an
OD600 of 0.5 to 1.3 in YPD, harvested by centrifugation, washed twice with 25 mM
Mes-KOH, pH 5.5, 0.5 - 2.5 µCi of 14C sucrose, and unlabeled sucrose at twice the
final concentration. Aliquots of the uptake solution and cells were collected at
specified time points, and uptake was quenched by transfer to 5 ml of ice-cold
water. The cells were collected by filtration through glass fiber filters and washed
five times with 5 ml of ice-cold water. The radioactivity taken up by the cells was
determined by liquid scintillation counting. All uptake assays were performed in a 
final concentration of 1 mM sucrose.

Nucleic acid sequences encoding modified forms of the SBP1 protein were
constructed and introduced into the pYESBP constructs described above. Fig. 2
shows the sucrose uptake rate obtained with yeast cells transformed with the
pMK195 vector only (filled circles), and constructs expressing the full length SBP1
protein (filled squares) and a truncated SBP1 protein missing the C-terminal 80
amino acids (filled triangles). The amino acid sequence of this truncated SBP1
protein is shown in Seq. I.D. No. 2. The truncated protein comprises residues 1-444
of the full length SBP1.

This surprising result indicates that enhanced sucrose uptake in plants may
be achieved by introducing transgenes encoding modified SBPs. Modified SBPs
having enhanced sucrose uptake activity include forms of SBP1 and SBP2 having
C-terminal deletions. Such deletions include removal of about 80 amino acids from
the C-terminus, but deletions of greater or fewer than 80 amino acids may also be
employed. The sucrose uptake activity of any particular deletion may readily be
determined using the yeast sucrose uptake assay described above. Thus, by way of
example, SBP proteins having C-terminal deletions of between 10 and 100 amino
acids are candidates for enhanced sucrose uptake activity and may be assayed using
this system.
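
The generation of candidate C-terminal deletions for testing in the yeast assay can be sketched as follows. The function names and the step size of 10 residues are hypothetical choices for this example, and the input is a placeholder string rather than the SBP1 sequence of Seq. I.D. No. 1; its length of 524 residues is inferred from the statement above that removing 80 residues leaves residues 1-444.

```python
def truncate_c_terminus(protein, n_removed):
    """Return the protein with `n_removed` residues deleted from the C-terminus."""
    if not 0 < n_removed < len(protein):
        raise ValueError("n_removed must be between 1 and len(protein) - 1")
    return protein[:-n_removed]


def truncation_series(protein, smallest=10, largest=100, step=10):
    """Map each deletion size to the corresponding truncated sequence."""
    return {
        n: truncate_c_terminus(protein, n)
        for n in range(smallest, largest + 1, step)
    }


# Placeholder of 524 residues (NOT the real SBP1 sequence).
placeholder_sbp1 = "M" + "A" * 523

truncated = truncate_c_terminus(placeholder_sbp1, 80)
print(len(truncated))  # 444, i.e. residues 1-444 are retained

series = truncation_series(placeholder_sbp1)
print(sorted(series))  # [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
```

Each truncated sequence in such a series would correspond to one construct to be expressed and assayed for sucrose uptake.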



EXAMPLES

The following examples are illustrative of various embodiments of the present 
invention. 

Example one: Preferred method for producing SBP nucleic acids 

This invention provides an SBP2 cDNA sequence and the amino acid
sequence of the SBP2 protein, modified SBP proteins having enhanced sucrose
uptake activity, and 5' regulatory regions for the SBP1 and SBP2 genes. The
polymerase chain reaction (PCR) may now be utilized in a preferred method for
producing nucleic acid sequences encoding the various SBP proteins described in
the invention, as well as the SBP gene 5' regulatory regions. PCR amplification of
cDNAs encoding the SBP proteins of the present invention may be accomplished
either by direct PCR from a plant cDNA library or by Reverse-Transcription PCR
(RT-PCR) using RNA extracted from plant cells as a template. Amplification of
SBP gene sequences and 5' regulatory regions may be accomplished by direct PCR
amplification from plant genomic DNA, or from a plant genomic library. Methods
and conditions for both direct PCR and RT-PCR are known in the art and are
described in Innis et al. (1990).

The selection of PCR primers will be made according to the portions of the
cDNA or gene that are to be amplified. Primers may be chosen to amplify small
segments of the cDNA, the open reading frame, the entire cDNA molecule or the
entire gene sequence. Variations in amplification conditions may be required to
accommodate primers of differing lengths; such considerations are well known in
the art and are discussed in Innis et al. (1990), Sambrook et al. (1989), and Ausubel
et al. (1992). By way of example only, the entire SBP2 cDNA molecule as shown in
Seq. I.D. No. 5 may be amplified using the following combination of primers:

primer 1 5' TGTAAAACGACGGCCAGTGAATT 3' (Seq. I.D. No. 8)

primer 2 5' GATTACGCCAAGCTCGAAATTAA 3' (Seq. I.D. No. 9)



The open reading frame portion of the SBP2 cDNA may be amplified using the
following primer pair:

primer 3 5' ATGGCGACCAGAGCCAAGCTTTCTTTA 3' (Seq. I.D. No. 10)

primer 4 5' CGCAACAGCGCGACGACCACGCTCGCT 3' (Seq. I.D. No. 11)

And a cDNA encoding a truncated version of the SBP2 protein (having the C-
terminal 80 amino acids removed) may be amplified using the following primer
pair:

primer 3 5' ATGGCGACCAGAGCCAAGCTTTCTTTA 3' (Seq. I.D. No. 10)

primer 5 5' GAAGGGATGACCAGGAGGGACAACAAA 3' (Seq. I.D. No. 12)

The SBP2 5' regulatory sequence may be amplified using the following primer
pair:

primer 6 5' TTGTAAACGACGGCCAGTGAATT 3' (Seq. I.D. No. 13)

primer 7 5' GGTGAGGTCAGTGAGGAACAACA 3' (Seq. I.D. No. 14)


These primers are illustrative only; it will be appreciated by one skilled in the art
that many different primers may be derived from the provided nucleic acid
sequences in order to amplify particular regions of these molecules. Resequencing of
PCR products obtained by these amplification procedures is recommended; this will
facilitate confirmation of the amplified sequence and will also provide information
on natural variation in this sequence in different ecotypes and plant populations.

Oligonucleotides that are derived from the SBP2 cDNA or the SBP1 and SBP2 5'
regulatory regions are encompassed within the scope of the present invention.
Preferably, such oligonucleotide primers will comprise a sequence of at least 15-20
consecutive nucleotides of the SBP2 cDNA or gene sequences. To enhance
amplification specificity, oligonucleotide primers comprising at least 25, 30, 35, 40,
45 or 50 consecutive nucleotides of these sequences may also be used.
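
The requirement that a primer comprise a minimum number of consecutive
nucleotides of a disclosed sequence can be checked computationally. The Python
sketch below is offered only as an illustration under stated assumptions: the
function names are hypothetical, and the 40-nucleotide template is an invented
fragment used solely to exercise the code, not a disclosed sequence.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def is_derived_primer(primer, template, min_len=15):
    """True if the primer is at least min_len nt long and matches a contiguous
    stretch of the template or of its reverse complement."""
    primer, template = primer.upper(), template.upper()
    if len(primer) < min_len:
        return False
    return primer in template or primer in reverse_complement(template)


# Hypothetical template fragment (assumption, for demonstration only).
template = "ATGGCGACCAGAGCCAAGCTTTCTTTACCGGATTACGTAA"
forward = template[:20]                       # forward primer from the 5' end
reverse = reverse_complement(template)[:20]   # reverse primer from the 3' end

print(is_derived_primer(forward, template))
print(is_derived_primer(reverse, template))
print(is_derived_primer(template[:10], template))  # shorter than 15 nt
```

The minimum length is a parameter, so the same check can be applied with the
longer thresholds (25 to 50 nucleotides) mentioned above.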

In addition, the SBP2 gene sequence may be obtained by PCR amplification
using primers derived from the disclosed cDNA sequence to probe a genomic
library or genomic DNA, or by probing a genomic DNA library using a labeled
probe derived from the SBP2 cDNA sequence. Standard PCR amplification or
hybridization methods may be used for these approaches.



Example Two: Isolation of homologous gene sequences from other plant species



With the provision herein of the soybean SBP2 cDNA, SBP 5' regulatory
regions, and the disclosed discovery that modification of SBP proteins, particularly
truncation of the C-terminus, produces enhanced sucrose uptake, the invention also
enables the production of corresponding molecules from other plant species. Thus,
the present invention permits the isolation of SBP2 homologs from other species, as
well as the production of enhanced efficiency SBP proteins of other plant species.
Both conventional hybridization and PCR amplification procedures may be utilized
to obtain corresponding cDNAs from other species and to produce nucleic acids
encoding enhanced activity SBP proteins. Common to both of these techniques is
the hybridization of probes or primers derived from the SBP2 cDNA or gene
sequence to a target nucleotide preparation, which may be, in the case of
conventional hybridization approaches, a cDNA or genomic library or, in the case of
PCR amplification, a cDNA or genomic library, or an mRNA preparation.

Direct PCR amplification may be performed on cDNA or genomic libraries
prepared from the plant species in question, or RT-PCR may be performed using
mRNA extracted from the plant cells using standard methods. PCR primers will
comprise at least 15 consecutive nucleotides of the SBP2 cDNA. One of skill in the
art will appreciate that sequence differences between the soybean SBP2 cDNA and
the target nucleic acid to be amplified may result in lower amplification efficiencies.
To compensate for this, longer PCR primers or lower annealing temperatures may
be used during the amplification cycle. Where lower annealing temperatures are
used, sequential rounds of amplification using nested primer pairs may be necessary
to enhance specificity.

For conventional hybridization, the hybridization probe is preferably
conjugated with a detectable label such as a radioactive label, and the probe is
preferably of at least 20 nucleotides in length. As is well known in the art,
increasing the length of hybridization probes tends to give enhanced specificity.
The labeled probe derived from the soybean SBP2 cDNA or gene sequence may be
hybridized to a plant cDNA or genomic library and the hybridization signal detected
using means known in the art. The hybridizing colony or plaque (depending on the
type of library used) is then purified and the cloned sequence contained in that
colony or plaque isolated and characterized.

Homologs of the soybean SBP2 cDNA may alternatively be obtained by
immunoscreening of an expression library. With the provision herein of the
disclosed SBP2 nucleic acid sequences, the protein may be expressed and purified
in a heterologous expression system (e.g., E. coli) and used to raise antibodies
(monoclonal or polyclonal) specific for the SBP2 protein. Antibodies may also be
raised against synthetic peptides derived from the SBP2 amino acid sequence
presented herein. Methods of raising antibodies are well known in the art and are
described in Harlow and Lane (1988). Such antibodies can then be used to screen
an expression cDNA library produced from the plant from which it is desired to
clone the SBP2 ortholog, using the methods described above. The selected cDNAs
can be confirmed by sequencing and by assaying activity.

The soybean SBP2 gene or cDNA, and homologs of these sequences from 
other plants may be incorporated into transformation vectors and introduced into 
plants to modify SBP activity in such plants, as described in Example Three below. 
In addition, nucleic acids encoding modified SBP proteins as taught herein may also 
be used to produce plants having modified sucrose uptake activity. It is anticipated 
that the native SBP gene promoter may be particularly useful in the practice of the 
present invention in that it may be used to drive the expression of SBP transgenes, 
such as antisense constructs. By using the native SBP gene promoter, expression of 
these transgenes may be regulated in coordination with the native SBP gene (for 
example, in the same temporal or tissue-specific expression patterns). 

Example Three: Transgenic plants having modified sucrose uptake activity 

Once a gene (or cDNA) encoding a protein involved in the determination of 
a particular plant characteristic has been isolated, standard techniques may be used 
to express the cDNA in transgenic plants in order to modify that particular plant 
characteristic. The basic approach is to clone the cDNA into a transformation 
vector, such that it is operably linked to control sequences (e.g., a promoter) that 
direct expression of the cDNA in plant cells. The transformation vector is then 
introduced into plant cells by one of a number of techniques (e.g., electroporation) 
and progeny plants containing the introduced cDNA are selected. Preferably all or 
part of the transformation vector will stably integrate into the genome of the plant 
cell. That part of the transformation vector which integrates into the plant cell and 
which contains the introduced cDNA and associated sequences for controlling 
expression (the introduced "transgene") may be referred to as the recombinant 
expression cassette. 

Selection of progeny plants containing the introduced transgene may be made 
based upon the detection of an altered phenotype. Such a phenotype may result 
directly from the cDNA cloned into the transformation vector or may be manifested 
as enhanced resistance to a chemical agent (such as an antibiotic) as a result of the 
inclusion of a dominant selectable marker gene incorporated into the transformation
vector. 

The choice of (a) control sequences and (b) how the cDNA (or selected
portions of the cDNA) are arranged in the transformation vector relative to the
control sequences determine, in part, how the plant characteristic affected by the
introduced cDNA is modified. For example, the control sequences may be tissue
specific, such that the cDNA is only expressed in particular tissues of the plant (e.g.,
pollen, seed) and so the affected characteristic will be modified only in those tissues.
The cDNA sequence may be arranged relative to the control sequence such that the
cDNA transcript is expressed normally, or in an antisense orientation. Expression
of an antisense RNA corresponding to the cloned cDNA will result in a reduction of
the targeted gene product (the targeted gene product being the protein encoded by
the plant gene from which the introduced cDNA was derived). Over-expression of
the introduced cDNA, resulting from a plus-sense orientation of the cDNA relative
to the control sequences in the vector, may lead to an increase in the level of the
gene product, or may result in co-suppression (also termed "sense suppression") of
that gene product.

The technical and scientific literature is replete with successful examples of
the modification of plant characteristics by transformation with cloned cDNA
sequences. Selected examples, which serve to illustrate the current
knowledge in this field of technology, and which are herein incorporated by
reference, include:

U.S. Patent No. 5,451,514 to Boudet (modification of lignin synthesis using
antisense RNA and co-suppression);

U.S. Patent No. 5,443,974 to Hitz (modification of saturated and unsaturated
fatty acid levels using antisense RNA and co-suppression);

U.S. Patent No. 5,530,192 to Murase (modification of amino acid and fatty
acid composition using antisense RNA);

U.S. Patent No. 5,455,167 to Voelker (modification of medium chain fatty
acids);

U.S. Patent No. 5,231,020 to Jorgensen (modification of flavor,
suppression);

U.S. Patent No. 5,583,021 to Dougherty (modification of virus resistance by
expression of plus-sense untranslatable RNA);

WO 96/13582 (modification of seed VLCFA composition using over-
expression, co-suppression and antisense RNA in conjunction with the Arabidopsis
FAE1 gene); and

WO 95/15387 (modification of seed VLCFA composition using over-
expression of jojoba wax synthesis gene).

These examples include descriptions of transformation vector selection,
transformation techniques and the preparation of constructs designed to over-
express the introduced cDNA or to express antisense RNA corresponding to the
cDNA. In light of the foregoing and the provision herein of the SBP2 gene and
nucleic acids encoding modified SBP proteins conferring enhanced sucrose uptake
activity, it is thus apparent that one of skill in the art will be able to introduce these
nucleic acids, or homologous or derivative forms of these molecules (e.g., antisense
forms), into plants in order to produce plants having modified sucrose uptake
activity in developing seeds and other tissues. The result can be altered
plant development with agricultural and economic consequences.



a. Plant Types 

Nucleic acid molecules according to the present invention (e.g., the SBP2
gene, nucleic acids encoding modified SBP proteins, homologs of these sequences
and derivatives such as antisense forms) may be introduced into any plant type in
order to modify sucrose uptake activity in the plant. Thus, the sequences of the
present invention may be used to modify sucrose uptake activity in any higher plant,
including monocotyledonous and dicotyledonous plants, including, but not limited
to, maize, wheat, rice, barley, soybean, beans in general, rape/canola, alfalfa, flax,
sunflower, safflower, brassica, cotton, peanut, clover; vegetables such as
lettuce, tomato, cucurbits, potato, carrot, radish, pea, lentils, cabbage, broccoli,
Brussels sprouts, peppers; tree fruits such as apples, pears, peaches, apricots; and
flowers such as carnations and roses.

b. Vector Construction, Choice of Promoters

A number of recombinant vectors suitable for stable transfection of plant cells
or for the establishment of transgenic plants have been described including those
described in Pouwels et al. (1987), Weissbach and Weissbach (1989), and Gelvin
et al. (1990). Typically, plant transformation vectors include one or more cloned
plant genes (or cDNAs) under the transcriptional control of 5' and 3' regulatory
sequences and a dominant selectable marker. Such plant transformation vectors
typically also contain a promoter regulatory region (e.g., a regulatory region
controlling inducible or constitutive, environmentally- or developmentally-regulated,
or cell- or tissue-specific expression), a transcription initiation start site, a ribosome
binding site, an RNA processing signal, a transcription termination site, and/or a
polyadenylation signal.

Examples of constitutive plant promoters which may be useful for expressing
nucleic acids include: the cauliflower mosaic virus (CaMV) 35S promoter, which
confers constitutive, high-level expression in most plant tissues (see, e.g., Odell et
al., 1985; Dekeyser et al., 1990; Terada and Shimamoto, 1990; Benfey and Chua,
1990); the nopaline synthase promoter (An et al., 1988); and the octopine synthase
promoter (Fromm et al., 1989).

A variety of plant gene promoters that are regulated in response to
environmental, hormonal, chemical, and/or developmental signals also can be used
for expression of the cDNA in plant cells, including promoters regulated by: (a) heat
(Callis et al., 1988; Ainley et al., 1993; Gilmartin et al., 1992); (b) light (e.g., the pea
rbcS-3A promoter, Kuhlemeier et al., 1989, and the maize rbcS promoter, Schaffner
and Sheen, 1991); (c) hormones, such as abscisic acid (Marcotte et al., 1989); (d)
wounding (e.g., wun1, Siebertz et al., 1989); and (e) chemicals such as methyl
jasmonate or salicylic acid (see also Gatz et al., 1997).

Alternatively, tissue specific (root, leaf, flower, and seed for example)
promoters (Carpenter et al., 1992; Denis et al., 1993; Opperman et al., 1993;
Stockhause et al., 1997; Roshal et al., 1987; Schernthaner et al., 1988; and Bustos et
al., 1989) can be fused to the coding sequence to obtain particular expression in
respective organs. In addition, the timing of the expression can be controlled by
using promoters such as those acting at senescence (Gan and Amasino, 1995) or
late seed development (Odell et al., 1994).

The promoter regions of the SBP1 and SBP2 genes disclosed herein confer
developing seed-specific expression in soybean. Accordingly, these promoters may
be used to obtain developing seed-specific expression of the introduced transgene.

Plant transformation vectors may also include RNA processing signals, for
example, introns, which may be positioned upstream or downstream of the ORF
sequence in the transgene. In addition, the expression vectors may also include
additional regulatory sequences from the 3'-untranslated region of plant genes, e.g.,
a 3' terminator region to increase stability of the mRNA, such as the PI-II
terminator region of potato or the octopine or nopaline synthase 3' terminator
regions.

Finally, as noted above, plant transformation vectors may also include
dominant selectable marker genes to allow for the ready selection of transformants.
Such genes include antibiotic resistance genes (e.g., resistance to
hygromycin, kanamycin, bleomycin, G418, streptomycin or spectinomycin) and
herbicide resistance genes (e.g., phosphinothricin acetyltransferase).



c. Arrangement of SBP sequence in vector 

The particular arrangement of the SBP sequence in the transformation vector
will be selected according to the type of expression of the sequence that is desired.

Where enhanced sucrose uptake activity is desired in the plant, the SBP ORF
may be operably linked to a constitutive high-level promoter such as the CaMV 35S
promoter. Modification of sucrose uptake activity may also be achieved by
introducing into a plant a transformation vector containing a variant form of the
SBP2 gene, for example a form which varies from the exact nucleotide sequence of
the SBP2 ORF, but which encodes a protein that retains the functional characteristic
of the SBP2 protein, i.e., conferring sucrose uptake activity. By way of example,
enhanced sucrose uptake activity may also be obtained by utilizing a nucleic acid
sequence encoding a modified SBP as discussed above. Such modified SBPs
include SBPs having C-terminal deletions, generally in the range of 10-100 amino
acid residues, and preferably about 80 amino acid residues.

In contrast, a reduction of sucrose uptake activity in the transgenic plant may be
obtained by introducing into plants antisense constructs based on a SBP gene
sequence. For antisense suppression, the SBP gene is arranged in reverse orientation
relative to the promoter sequence in the transformation vector. The introduced
sequence need not be the full length SBP gene, and need not be exactly homologous
to the SBP gene found in the plant type to be transformed. Generally, however,
where the introduced sequence is of shorter length, a higher degree of homology to
the native SBP sequence will be needed for effective antisense suppression.
Preferably, the introduced antisense sequence in the vector will be at least 30
nucleotides in length, and improved antisense suppression will typically be observed
as the length of the antisense sequence increases. Preferably, the length of the
antisense sequence in the vector will be greater than 100 nucleotides. Transcription
of an antisense construct as described results in the production of RNA molecules
that are the reverse complement of mRNA molecules transcribed from the
endogenous SBP gene in the plant cell. Although the exact mechanism by which
antisense RNA molecules interfere with gene expression has not been elucidated, it
is believed that antisense RNA molecules bind to the endogenous mRNA molecules
and thereby inhibit translation of the endogenous mRNA.
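
The reverse-complement relationship between an antisense transcript and the
endogenous mRNA may be illustrated with a short Python sketch. This is an
illustration only: the function names are hypothetical, and the example fragment is
the sequence of primer 3 from Example One, used on the assumption that it
corresponds to the first 27 nucleotides of the SBP2 open reading frame.

```python
def antisense_rna(sense_dna):
    """Return the RNA that is the reverse complement of the mRNA encoded
    by the given sense-strand DNA sequence."""
    pairs = {"A": "U", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in reversed(sense_dna.upper()))


def meets_length_guidance(sequence, minimum=30):
    """True if the antisense insert is at least `minimum` nucleotides long."""
    return len(sequence) >= minimum


# Primer 3 sequence (assumed here to match the start of the SBP2 ORF).
fragment = "ATGGCGACCAGAGCCAAGCTTTCTTTA"

print(antisense_rna(fragment))
# The 27-nt fragment falls short of the preferred 30-nt minimum.
print(meets_length_guidance(fragment))
# The stronger preference (greater than 100 nt) can be tested the same way.
print(meets_length_guidance(fragment, minimum=101))
```

In a real construct the antisense insert would typically be considerably longer than
this fragment, in keeping with the length preferences stated above.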

Suppression of endogenous SBP gene expression can also be achieved using
ribozymes. Ribozymes are synthetic RNA molecules that possess highly specific
endoribonuclease activity. The production and use of ribozymes are disclosed in
U.S. Patent No. 4,987,071 to Cech and U.S. Patent No. 5,543,508 to Haselhoff. The
inclusion of ribozyme sequences within antisense RNAs may be used to confer
RNA cleaving activity on the antisense RNA, such that endogenous mRNA
molecules that bind to the antisense RNA are cleaved, which in turn leads to an
enhanced antisense inhibition of endogenous gene expression.

Constructs in which a SBP nucleic acid (or a variant thereof) is over-
expressed may also be used to obtain co-suppression of the endogenous SBP gene in
the manner described in U.S. Patent No. 5,231,021 to Jorgensen. Such co-
suppression (also termed sense suppression) does not require that the entire SBP
gene be introduced into the plant cells, nor does it require that the introduced
sequence be exactly identical to the endogenous SBP gene. However, as with antisense
suppression, the suppressive efficiency will be enhanced as (1) the introduced
sequence is lengthened and (2) the sequence similarity between the introduced
sequence and the endogenous SBP gene is increased.

Constructs expressing an untranslatable form of a SBP mRNA may also be
used to suppress the expression of endogenous SBP activity. Methods for
producing such constructs are described in U.S. Patent No. 5,583,021 to Dougherty
et al. Preferably, such constructs are made by introducing a premature stop codon
into the SBP ORF.
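
The introduction of a premature stop codon can be sketched as a simple in-frame
codon replacement. The following Python example is a minimal illustration under
stated assumptions: the function name and the 15-nucleotide fragment are
hypothetical, and the sketch does not purport to reproduce the procedures of
Dougherty et al.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def introduce_stop(orf, codon_index, stop="TAA"):
    """Replace the codon at codon_index (0-based) with a stop codon,
    leaving the length and reading frame of the ORF unchanged."""
    if stop not in STOP_CODONS:
        raise ValueError("stop must be TAA, TAG or TGA")
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    start = codon_index * 3
    if codon_index < 1 or start >= len(orf):
        raise ValueError("codon_index must lie after the start codon and inside the ORF")
    return orf[:start] + stop + orf[start + 3:]


# Hypothetical five-codon fragment used only for demonstration.
orf = "ATGGCGACCAGAGCC"
mutant = introduce_stop(orf, 2)
print(mutant)
```

Because the replacement is codon-for-codon, the transcript retains its length while
translation would terminate at the introduced stop codon.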

Finally, dominant negative mutant forms of the disclosed sequences may be
used to block endogenous SBP activity. Such mutants require the production of
mutated forms of the SBP protein that bind to sucrose but do not catalyze the uptake
of sucrose.



d. Transformation and Regeneration Techniques 

Transformation and regeneration of both monocotyledonous and
dicotyledonous plant cells is now routine, and the selection of the most appropriate
transformation technique will be determined by the practitioner. The choice of
method will vary with the type of plant to be transformed; those skilled in the art
will recognize the suitability of particular methods for given plant types. Suitable
methods may include, but are not limited to: electroporation of plant protoplasts;
liposome-mediated transformation; polyethylene glycol (PEG) mediated
transformation; transformation using viruses; micro-injection of plant cells; micro-
projectile bombardment of plant cells; vacuum infiltration; and Agrobacterium
tumefaciens (AT) mediated transformation. Typical procedures for transforming and
regenerating plants are described in the patent documents listed at the beginning of
this section.

e. Selection of Transformed Plants 

Following transformation and regeneration of plants with the transformation
vector, transformed plants are preferably selected using a dominant selectable
marker incorporated into the transformation vector. Typically, such a marker will
confer antibiotic resistance on the seedlings of transformed plants, and selection of
transformants can be accomplished by exposing the seedlings to appropriate
concentrations of the antibiotic.

After transformed plants are selected and grown to maturity, they can be
assayed using known methods to determine whether SBP activity has been altered
as a result of the introduced transgene. In addition, antisense or sense suppression of
an endogenous SBP gene may be detected by analyzing mRNA expression on
Northern blots.

Example Four: Production of sequence variants 

As noted above, modification of sucrose uptake activity in plant cells can be
achieved by transforming plants with the SBP2 cDNA or gene, antisense constructs
based on the SBP2 cDNA or gene sequence or nucleic acid sequences encoding
modified SBP proteins. With the provision of the SBP2 cDNA and gene sequences
and the SBP 5' regulatory regions herein, the creation of variants on these sequences
by standard mutagenesis techniques is now enabled.

Variant DNA molecules include those created by standard DNA mutagenesis
techniques, for example, M13 primer mutagenesis. Details of these techniques are
provided in Sambrook et al. (1989), Ch. 15. By the use of such techniques, variants
may be created which differ in minor ways from the sequences disclosed.

DNA molecules and nucleotide sequences which are derivatives of those
specifically disclosed herein and which differ from those disclosed by the deletion,
addition or substitution of nucleotides while still encoding a protein which possesses
the functional characteristic of a SBP protein (i.e., the ability to mediate sucrose
uptake in the yeast assay system) are comprehended by this invention. DNA
molecules and nucleotide sequences which are derived from the SBP2 cDNA and
gene sequences disclosed include DNA sequences which hybridize under stringent
conditions to the DNA sequences disclosed, or fragments thereof.

Hybridization conditions resulting in particular degrees of stringency will vary
depending upon the nature of the hybridization method of choice and the
composition and length of the hybridizing DNA used. Generally, the temperature of
hybridization and the ionic strength (especially the Na+ concentration) of the
hybridization buffer will determine the stringency of hybridization. Calculations
regarding hybridization conditions required for attaining particular degrees of
stringency are discussed by Sambrook et al. (1989), chapters 9 and 11, herein
incorporated by reference. By way of illustration only, a hybridization experiment
may be performed by hybridization of a DNA molecule (for example, the soybean SBP2
cDNA sequence) to a target DNA molecule (for example, a corresponding SBP2
cDNA sequence in tobacco) which has been electrophoresed in an agarose gel and
transferred to a nitrocellulose membrane by Southern blotting (Southern, 1975), a
technique well known in the art and described in Sambrook et al. (1989).
Hybridization with a target probe labeled with [32P]dCTP is generally carried out in
a solution of high ionic strength such as 6×SSC at a temperature that is 20-25 °C
below the melting temperature, Tm, described below. For such Southern
hybridization experiments where the target DNA molecule on the Southern blot
contains 10 ng of DNA or more, hybridization is typically carried out for 6-8 hours
using 1-2 ng/ml radiolabeled probe (of specific activity equal to 10^9 CPM/µg or
greater). Following hybridization, the nitrocellulose filter is washed to remove
background hybridization. The washing conditions should be as stringent as
possible to remove background hybridization but to retain a specific hybridization
signal. The term Tm represents the temperature above which, under the prevailing
ionic conditions, the radiolabeled probe molecule will not hybridize to its target
DNA molecule. The Tm of such a hybrid molecule may be estimated from the
following equation (Bolton and McCarthy, 1962):

$T_m = 81.5\,^{\circ}\mathrm{C} + 16.6\,(\log_{10}[\mathrm{Na}^+]) + 0.41\,(\%\mathrm{G{+}C}) - 0.63\,(\%\,\mathrm{formamide}) - (600/l)$

where l = the length of the hybrid in base pairs.

This equation is valid for concentrations of Na+ in the range of 0.01 M to 0.4 M, and
it is less accurate for calculations of Tm in solutions of higher [Na+]. The equation is
also primarily valid for DNAs whose G+C content is in the range of 30% to 75%,
and it applies to hybrids greater than 100 nucleotides in length (the behavior of
oligonucleotide probes is described in detail in Ch. 11 of Sambrook et al., 1989).
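
The melting temperature equation above translates directly into a small function.
The Python sketch below is provided only as an illustration; the function and
parameter names are assumptions, and the input values in the usage example are
arbitrary round numbers chosen to lie within the stated validity ranges.

```python
import math


def melting_temperature(na_molar, gc_percent, formamide_percent, length_bp):
    """Estimate Tm (in degrees C) of a DNA hybrid from:
    Tm = 81.5 + 16.6*log10[Na+] + 0.41*(%G+C) - 0.63*(%formamide) - 600/l

    Stated validity: [Na+] of 0.01-0.4 M, G+C content of 30-75 %,
    and hybrids longer than 100 nucleotides.
    """
    if na_molar <= 0 or length_bp <= 0:
        raise ValueError("Na+ concentration and hybrid length must be positive")
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.63 * formamide_percent
            - 600.0 / length_bp)


# Arbitrary illustrative inputs: 0.1 M Na+, 50 % G+C, no formamide, 200 bp hybrid.
tm = melting_temperature(0.1, 50, 0, 200)
print(f"Estimated Tm: {tm:.1f} C")
```

Adding formamide lowers the estimate by 0.63 °C per percent, which is why
formamide is commonly used to reduce hybridization temperatures.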

Thus, by way of example, for a 150 base pair DNA probe derived from the
first 150 base pairs of the open reading frame of the soybean SBP2 cDNA (with a
hypothetical %GC = 45%), a calculation of hybridization conditions required to give
particular stringencies may be made as follows:

For this example, it is assumed that the filter will be washed in 0.3×SSC
solution following hybridization, so that [Na+] = 0.045 M; %GC = 45%;
formamide concentration = 0; l = 150 base pairs; and Tm = 81.5 + 16(log10[Na+]) +
(0.41 × 45) - (600/150), and so Tm = 74.4 °C.

The Tm of double-stranded DNA decreases by 1-1.5 °C with every 1% decrease
in homology (Bonner et al., 1973). Therefore, for this given example, washing the
filter in 0.3×SSC at 59.4-64.4 °C will produce a stringency of hybridization
equivalent to 90%. Alternatively, washing the hybridized filter in 0.3×SSC at a
temperature of 65.4-68.4 °C will yield a hybridization stringency of 94%. The
above example is given entirely by way of theoretical illustration. One skilled in the
art will appreciate that other hybridization techniques may be utilized and that
variations in experimental conditions will necessitate alternative calculations for
stringency.
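
The wash temperatures in the foregoing example follow from subtracting 1-1.5 °C
from the Tm for each percent of mismatch. The Python sketch below reproduces that
arithmetic; the function name and parameter names are assumptions, and the Tm of
74.4 °C is simply the value from the worked example above.

```python
def wash_temperature_range(tm_c, percent_homology, drop_low=1.0, drop_high=1.5):
    """Return (lowest, highest) wash temperature in degrees C for a given
    stringency, assuming Tm falls by drop_low to drop_high degrees C for
    every 1 % decrease in homology."""
    mismatch = 100.0 - percent_homology
    return (tm_c - drop_high * mismatch, tm_c - drop_low * mismatch)


for stringency in (90, 94):
    low, high = wash_temperature_range(74.4, stringency)
    print(f"{stringency}% stringency: {low:.1f}-{high:.1f} C")
# 90% gives 59.4-64.4 C and 94% gives 65.4-68.4 C, as in the example above.
```

The same function can be applied to any Tm estimate and any target stringency, for
instance the 75% to 95% thresholds used elsewhere in this description.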

DNA sequences from plants that encode a protein having SBP activity and
which hybridize under hybridization conditions of at least 75%, more preferably at
least 80%, more preferably at least 85%, more preferably at least 90% and most
preferably at least 95% stringency to the disclosed SBP2 sequence are encompassed
within the present invention.

The degeneracy of the genetic code further widens the scope of the present
invention as it enables major variations in the nucleotide sequence of a DNA
molecule while maintaining the amino acid sequence of the encoded protein. For
example, the second amino acid residue of the soybean SBP2 protein is alanine.
This is encoded in the soybean SBP2 open reading frame by the nucleotide codon
triplet GCG. Because of the degeneracy of the genetic code, three other nucleotide
codon triplets (GCA, GCC and GCT) also code for alanine. Thus, the nucleotide
sequence of the soybean SBP2 ORF could be changed at this position to any of
these three codons without affecting the amino acid composition of the encoded
protein or the characteristics of the protein. Based upon the degeneracy of the
genetic code, variant DNA molecules may be derived from the cDNA and gene
sequences disclosed herein using standard DNA mutagenesis techniques as







WO 98/53086 PCT/US98/1 0465 


described above, or by synthesis of DNA sequences. Thus, this invention also 
encompasses nucleic acid sequences which encode a SBP protein but which vary 
from the disclosed nucleic acid sequences by virtue of the degeneracy of the genetic 
code. 
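
The alanine example given above can be reproduced with the standard genetic code.
The following Python sketch is illustrative only; the function names are
assumptions, and the nine-nucleotide fragment is taken from the 5' end of primer 3
in Example One on the assumption that it matches the start of the SBP2 open
reading frame.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(orf):
    """Translate an in-frame DNA sequence into one-letter amino acids."""
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def synonymous_codons(codon):
    """Return the other codons that encode the same amino acid."""
    aa = CODON_TABLE[codon]
    return sorted(c for c, a in CODON_TABLE.items() if a == aa and c != codon)


fragment = "ATGGCGACC"              # Met-Ala-Thr (assumed start of the ORF)
print(synonymous_codons("GCG"))     # alternatives to the alanine codon GCG

# Swapping the second codon for each synonym leaves the protein unchanged.
for codon in synonymous_codons("GCG"):
    variant = fragment[:3] + codon + fragment[6:]
    print(variant, translate(variant) == translate(fragment))
```

Extending the same substitution to every codon of an open reading frame yields the
full family of degenerate variants encoding an identical protein.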

The present invention teaches that enhanced sucrose uptake activity may be
obtained by modifying the sequence of a plant SBP, e.g., by deleting 80 C-terminal
amino acids. One skilled in the art will recognize that DNA mutagenesis techniques
may be used not only to produce variant DNA molecules, but will also facilitate the
production of such modified SBP proteins. In addition, other changes to the amino
acid sequence can be made including deletions, additions and substitutions.

While the site for introducing an amino acid sequence variation is
predetermined, the mutation per se need not be predetermined. For example, in
order to optimize the performance of a mutation at a given site, random mutagenesis
may be conducted at the target codon or region and the expressed protein variants
screened for the optimal combination of desired activity. Techniques for making
substitution mutations at predetermined sites in DNA having a known sequence as
described above are well known.

Amino acid substitutions are typically of single residues; insertions usually
will be on the order of about 1 to 10 amino acid residues; and deletions will
range from about 1 to more than 100 residues. Substitutions, deletions, insertions or
any combination thereof may be combined to arrive at a final construct. Obviously,
the mutations that are made in the DNA encoding the protein must not place the
sequence out of reading frame and preferably will not create complementary regions
that could produce secondary mRNA structure.
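
The reading-frame requirement amounts to a length check: an insertion or
deletion keeps downstream codons in frame only if it changes the coding
sequence by a multiple of three nucleotides. The sketch below is an
illustration under that assumption; the function names are hypothetical, and
the example string is the first 27 nucleotides of the open reading frame shown
in SEQ ID NO: 5.

```python
def preserves_reading_frame(original_orf, mutated_orf):
    """True if the length change between the two ORFs is a whole number of codons."""
    return (len(mutated_orf) - len(original_orf)) % 3 == 0


def delete_nucleotides(orf, start, length):
    """Return `orf` with `length` nucleotides removed beginning at index `start`."""
    return orf[:start] + orf[start + length:]


# ATG GCG ACC AGA GCC AAG CTT TCT TTA (start of the ORF in SEQ ID NO: 5).
orf = "ATGGCGACCAGAGCCAAGCTTTCTTTA"

two_codon_deletion = delete_nucleotides(orf, 3, 6)   # removes GCG ACC
four_base_deletion = delete_nucleotides(orf, 3, 4)   # shifts the frame

print(preserves_reading_frame(orf, two_codon_deletion))
print(preserves_reading_frame(orf, four_base_deletion))
```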

Substitutional variants are those in which at least one residue in the amino acid
sequence has been removed and a different residue inserted in its place. Such
substitutions generally are made in accordance with the following Table 1 when it is
desired to finely modulate the characteristics of the protein. Table 1 shows amino
acids which may be substituted for an original amino acid in a protein and which are
regarded as conservative substitutions.

Table 1.

Original Residue     Conservative Substitutions
Ala                  ser
Arg                  lys
Asn                  gln; his
Asp                  glu
Cys                  ser
Gln                  asn
Glu                  asp
Gly                  pro
His                  asn; gln
Ile                  leu; val
Leu                  ile; val
Lys                  arg; gln; glu
Met                  leu; ile
Phe                  met; leu; tyr
Ser                  thr
Thr                  ser
Trp                  tyr
Tyr                  trp; phe
Val                  ile; leu
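
Table 1 can be used as a simple lookup when screening proposed substitutions.
The following Python sketch transcribes the table into a dictionary; the names
`CONSERVATIVE_SUBSTITUTIONS` and `is_conservative` are illustrative assumptions.
Note that the table is directional: a pair is treated as conservative only if it
is listed under the original residue.

```python
# Transcription of Table 1: original residue -> conservative substitutions.
CONSERVATIVE_SUBSTITUTIONS = {
    "Ala": {"Ser"},
    "Arg": {"Lys"},
    "Asn": {"Gln", "His"},
    "Asp": {"Glu"},
    "Cys": {"Ser"},
    "Gln": {"Asn"},
    "Glu": {"Asp"},
    "Gly": {"Pro"},
    "His": {"Asn", "Gln"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Ile"},
    "Phe": {"Met", "Leu", "Tyr"},
    "Ser": {"Thr"},
    "Thr": {"Ser"},
    "Trp": {"Tyr"},
    "Tyr": {"Trp", "Phe"},
    "Val": {"Ile", "Leu"},
}


def is_conservative(original, replacement):
    """True if Table 1 lists `replacement` as conservative for `original`."""
    allowed = CONSERVATIVE_SUBSTITUTIONS.get(original.capitalize(), set())
    return replacement.capitalize() in allowed


print(is_conservative("Ala", "ser"))
print(is_conservative("Ser", "ala"))
```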



Substantial changes in enzymatic function or other features are made by
selecting substitutions that are less conservative than those in Table 1, i.e., selecting
residues that differ more significantly in their effect on maintaining (a) the structure
of the polypeptide backbone in the area of the substitution, for example, as a sheet
or helical conformation, (b) the charge or hydrophobicity of the molecule at the
target site, or (c) the bulk of the side chain. The substitutions which in general are
expected to produce the greatest changes in protein properties will be those in which
(a) a hydrophilic residue, e.g., seryl or threonyl, is substituted for (or by) a
hydrophobic residue, e.g., leucyl, isoleucyl, phenylalanyl, valyl or alanyl; (b) a
cysteine or proline is substituted for (or by) any other residue; (c) a residue having
an electropositive side chain, e.g., lysyl, arginyl, or histidyl, is substituted for (or
by) an electronegative residue, e.g., glutamyl or aspartyl; or (d) a residue having a
bulky side chain, e.g., phenylalanine, is substituted for (or by) one not having a side
chain, e.g., glycine.

The effects of these amino acid substitutions or deletions or additions
may be assessed for derivatives of the SBP proteins by analyzing the ability of the
derivative proteins to catalyze sucrose uptake in the yeast assay system described
above.

Example Five: Use of SBP 5' regulatory regions
to control transgene expression

The promoters of the Glycine SBP1 and SBP2 genes confer developing seed-
specific expression. Accordingly, the promoter sequences, shown in Seq. I.D. Nos.
7 (SBP2) and 8 (SBP1), may be used to produce transgene constructs that are
specifically expressed in developing seeds. One of skill in the art will recognize that
regulation of transgene expression in developing seeds may be achieved with less
than the entire 5' regulatory sequences shown in Seq. I.D. Nos. 7 & 8. Thus, by
way of example, developing seed-specific expression may be obtained by
employing a 50 base pair or 100 base pair region of the disclosed promoter
sequences. The determination of whether a particular sub-region of the disclosed
sequence operates to confer effective seed-specific expression in a particular system
(taking into account the plant species into which the construct is being introduced,
the level of expression required, etc.) will be performed using known methods, such
as operably linking the promoter sub-region to a marker gene (e.g. GUS),
introducing such constructs into plants and then determining the level of expression
of the marker gene in developing seeds and other plant tissues.
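
Candidate 50 base pair or 100 base pair sub-regions can be enumerated in
silico before they are cloned and tested. The following Python sketch is
illustrative only: the function `promoter_subregions`, the step size and the
placeholder sequence are assumptions, and whether any given fragment confers
seed-specific expression must still be determined experimentally, for example
with the marker gene assay described above.

```python
def promoter_subregions(promoter, size, step=1):
    """Yield (start, fragment) for every window of `size` bp, moving `step` bp at a time."""
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    for start in range(0, len(promoter) - size + 1, step):
        yield start, promoter[start:start + size]


# 200 bp placeholder, NOT one of the disclosed promoter sequences.
toy_promoter = "ACGT" * 50

fragments_50 = list(promoter_subregions(toy_promoter, 50, step=25))
fragments_100 = list(promoter_subregions(toy_promoter, 100, step=25))

for start, fragment in fragments_100:
    print(start, len(fragment))
```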

The present invention therefore facilitates the production, by standard
molecular biology techniques, of nucleic acid molecules comprising the SBP1 or
SBP2 promoter sequence operably linked to a nucleic acid sequence, such as an
open reading frame. Suitable open reading frames include open reading frames
encoding any protein for which expression in developing seeds is desired.
Examples of genes that may suitably be expressed in a seed-specific manner under
the control of the disclosed SBP promoters include, but are not limited to:

(1) genes that enhance the nutritional quality of the seeds, for example, by
increasing the content of limiting amino acids, including lysine, methionine and
cysteine. This may be achieved by expressing proteins containing high levels of
these amino acids in seeds. Examples include the high methionine storage proteins
from brazil nut (Saalbach et al., 1996) and sunflower (Molvig et al., 1997).






(2) genes that increase gluten levels in wheat, so as to enhance the bread-
making quality of the wheat flour (Shewry et al., 1995).

(3) genes that enhance insect resistance in the seed (for example, resistance to
weevils). Suitable genes include the α-amylase inhibitor gene which kills seed
weevils (Schmidt, 1994).

SEQUENCE LISTING



(1) GENERAL INFORMATION
    (i) APPLICANT: Grimes
    (ii) TITLE OF INVENTION: Sucrose binding proteins
    (iii) NUMBER OF SEQUENCES: 15
    (iv) CORRESPONDENCE ADDRESS:
        (A) ADDRESSEE: Klarquist Sparkman Campbell Leigh & Whinston, LLP
        (B) STREET: One World Trade Center
                    121 S.W. Salmon Street
                    Suite 1600
        (C) CITY: Portland
        (D) STATE: Oregon
        (E) COUNTRY: United States of America
        (F) ZIP: 97204-2988
    (v) COMPUTER READABLE FORM:
        (A) MEDIUM TYPE: Disk, 3-1/2 inch
        (B) COMPUTER: IBM PC compatible
        (C) OPERATING SYSTEM: Windows NT
        (D) SOFTWARE: Word 97 & ASCII
    (vi) CURRENT APPLICATION DATA:
        (A) APPLICATION NUMBER:
        (B) FILING DATE:
        (C) CLASSIFICATION:
    (vii) PRIOR APPLICATION DATA:
        (A) APPLICATION NUMBER: 60/047,568
        (B) FILING DATE: May 22, 1997
        (C) CLASSIFICATION:
    (viii) ATTORNEY/AGENT INFORMATION:
        (A) NAME: David J. Earp
        (B) REGISTRATION NUMBER: 41,401
        (C) REFERENCE/DOCKET NUMBER: 4630-50206/DJE
    (ix) TELECOMMUNICATION INFORMATION:
        (A) TELEPHONE: (503) 226-7391
        (B) TELEFAX: (503) 228-9446

(2) INFORMATION FOR SEQ ID NO: 1:
    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 524
        (B) TYPE: amino acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

Met Gly Met Arg Thr Lys Leu Ser Leu Ala Ile Phe Phe Phe Phe
                  5                  10                  15

Leu Leu Ala Leu Phe Ser Asn Leu Ala Phe Gly Lys Cys Lys Glu
                 20                  25                  30

Thr Glu Val Glu Glu Glu Asp Pro Glu Leu Val Thr Cys Lys His
                 35                  40                  45

Gln Cys Gln Gln Gln Gln Gln Tyr Thr Glu Gly Asp Lys Arg Val
                 50                  55                  60

Cys Leu Gln Ser Cys Asp Arg Tyr His Arg Met Lys Gln Glu Arg
                 65                  70                  75

Glu Lys Gln Ile Gln Glu Glu Thr Arg Glu Lys Lys Glu Glu Glu
                 80                  85                  90

Ser Arg Glu Arg Glu Glu Glu Gln Gln Glu Gln His Glu Glu Gln
                 95                 100                 105

Asp Glu Asn Pro Tyr Ile Phe Glu Glu Asp Lys Asp Phe Glu Thr
                110                 115                 120

Arg Val Glu Thr Glu Gly Gly Arg Ile Arg Val Leu Lys Lys Phe
                125                 130                 135

Thr Glu Lys Ser Lys Leu Leu Gln Gly Ile Glu Asn Phe Arg Leu
                140                 145                 150

Ala Ile Leu Glu Ala Arg Ala His Thr Phe Val Ser Pro Arg His
                155                 160                 165

Phe Asp Ser Glu Val Val Phe Phe Asn Ile Lys Gly Arg Ala Val
                170                 175                 180

Leu Gly Leu Val Ser Glu Ser Glu Thr Glu Lys Ile Thr Leu Glu
                185                 190                 195

Pro Gly Asp Met Ile His Ile Pro Ala Gly Thr Pro Leu Tyr Ile
                200                 205                 210

Val Asn Arg Asp Glu Asn Asp Lys Leu Phe Leu Ala Met Leu His
                215                 220                 225

Ile Pro Val Ser Val Ser Thr Pro Gly Lys Phe Glu Glu Phe Phe
                230                 235                 240

Ala Pro Gly Gly Arg Asp Pro Glu Ser Val Leu Ser Ala Phe Ser
                245                 250                 255

Trp Asn Val Leu Gln Ala Ala Leu Gln Thr Pro Lys Gly Lys Leu
                260                 265                 270

Glu Asn Val Phe Asp Gln Gln Asn Glu Gly Ser Ile Phe Arg Ile
                275                 280                 285

Ser Arg Glu Gln Val Arg Ala Leu Ala Pro Thr Lys Lys Ser Ser
                290                 295                 300

Trp Trp Pro Phe Gly Gly Glu Ser Lys Pro Gln Phe Asn Ile Phe
                305                 310                 315

Ser Lys Arg Pro Thr Ile Ser Asn Gly Tyr Gly Arg Leu Thr Glu
                320                 325                 330

Val Gly Pro Asp Asp Asp Glu Lys Ser Trp Leu Gln Arg Leu Asn
                335                 340                 345

Leu Met Leu Thr Phe Thr Asn Ile Thr Gln Arg Ser Met Ser Thr
                350                 355                 360

Ile His Tyr Asn Ser His Ala Thr Lys Ile Ala Leu Val Ile Asp
                365                 370                 375

Gly Arg Gly His Leu Gln Ile Ser Cys Pro His Met Ser Ser Arg
                380                 385                 390

Ser Ser His Ser Lys His Asp Lys Ser Ser Pro Ser Tyr His Arg
                395                 400                 405

Ile Ser Ser Asp Leu Lys Pro Gly Met Val Phe Val Val Pro Pro
                410                 415                 420

Gly His Pro Phe Val Thr Ile Ala Ser Asn Lys Glu Asn Leu Leu
                425                 430                 435

Met Ile Cys Phe Glu Val Asn Ala Arg Asp Asn Lys Lys Phe Thr
                440                 445                 450

Phe Ala Gly Lys Asp Asn Ile Val Ser Ser Leu Asp Asn Val Ala
                455                 460                 465

Lys Glu Leu Ala Phe Asn Tyr Pro Ser Glu Met Val Asn Gly Val
                470                 475                 480

Phe Leu Leu Gln Arg Phe Leu Glu Arg Lys Leu Ile Gly Arg Leu
                485                 490                 495

Tyr His Leu Pro His Lys Asp Arg Lys Glu Ser Phe Phe Phe Pro
                500                 505                 510

Phe Glu Leu Pro Arg Glu Glu Arg Gly Arg Arg Ala Asp Ala
                515                 520



(2) INFORMATION FOR SEQ ID NO: 2:
    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 444
        (B) TYPE: amino acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Gly Met Arg Thr Lys Leu Ser Leu Ala Ile Phe Phe Phe Phe
                  5                  10                  15

Leu Leu Ala Leu Phe Ser Asn Leu Ala Phe Gly Lys Cys Lys Glu
                 20                  25                  30

Thr Glu Val Glu Glu Glu Asp Pro Glu Leu Val Thr Cys Lys His
                 35                  40                  45

Gln Cys Gln Gln Gln Gln Gln Tyr Thr Glu Gly Asp Lys Arg Val
                 50                  55                  60

Cys Leu Gln Ser Cys Asp Arg Tyr His Arg Met Lys Gln Glu Arg
                 65                  70                  75

Glu Lys Gln Ile Gln Glu Glu Thr Arg Glu Lys Lys Glu Glu Glu
                 80                  85                  90

Ser Arg Glu Arg Glu Glu Glu Gln Gln Glu Gln His Glu Glu Gln
                 95                 100                 105

Asp Glu Asn Pro Tyr Ile Phe Glu Glu Asp Lys Asp Phe Glu Thr
                110                 115                 120

Arg Val Glu Thr Glu Gly Gly Arg Ile Arg Val Leu Lys Lys Phe
                125                 130                 135

Thr Glu Lys Ser Lys Leu Leu Gln Gly Ile Glu Asn Phe Arg Leu
                140                 145                 150

Ala Ile Leu Glu Ala Arg Ala His Thr Phe Val Ser Pro Arg His
                155                 160                 165

Phe Asp Ser Glu Val Val Phe Phe Asn Ile Lys Gly Arg Ala Val
                170                 175                 180

Leu Gly Leu Val Ser Glu Ser Glu Thr Glu Lys Ile Thr Leu Glu
                185                 190                 195

Pro Gly Asp Met Ile His Ile Pro Ala Gly Thr Pro Leu Tyr Ile
                200                 205                 210

Val Asn Arg Asp Glu Asn Asp Lys Leu Phe Leu Ala Met Leu His
                215                 220                 225

Ile Pro Val Ser Val Ser Thr Pro Gly Lys Phe Glu Glu Phe Phe
                230                 235                 240

Ala Pro Gly Gly Arg Asp Pro Glu Ser Val Leu Ser Ala Phe Ser
                245                 250                 255

Trp Asn Val Leu Gln Ala Ala Leu Gln Thr Pro Lys Gly Lys Leu
                260                 265                 270

Glu Asn Val Phe Asp Gln Gln Asn Glu Gly Ser Ile Phe Arg Ile
                275                 280                 285

Ser Arg Glu Gln Val Arg Ala Leu Ala Pro Thr Lys Lys Ser Ser
                290                 295                 300

Trp Trp Pro Phe Gly Gly Glu Ser Lys Pro Gln Phe Asn Ile Phe
                305                 310                 315

Ser Lys Arg Pro Thr Ile Ser Asn Gly Tyr Gly Arg Leu Thr Glu
                320                 325                 330

Val Gly Pro Asp Asp Asp Glu Lys Ser Trp Leu Gln Arg Leu Asn
                335                 340                 345

Leu Met Leu Thr Phe Thr Asn Ile Thr Gln Arg Ser Met Ser Thr
                350                 355                 360

Ile His Tyr Asn Ser His Ala Thr Lys Ile Ala Leu Val Ile Asp
                365                 370                 375

Gly Arg Gly His Leu Gln Ile Ser Cys Pro His Met Ser Ser Arg
                380                 385                 390

Ser Ser His Ser Lys His Asp Lys Ser Ser Pro Ser Tyr His Arg
                395                 400                 405

Ile Ser Ser Asp Leu Lys Pro Gly Met Val Phe Val Val Pro Pro
                410                 415                 420

Gly His Pro Phe Val Thr Ile Ala Ser Asn Lys Glu Asn Leu Leu
                425                 430                 435

Met Ile Cys Phe Glu Val Asn Ala Arg
                440



(2) INFORMATION FOR SEQ ID NO: 3:
    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 489
        (B) TYPE: amino acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

Met Ala Thr Arg Ala Lys Leu Ser Leu Ala Ile Phe Leu Phe Phe
                  5                  10                  15

Leu Leu Ala Leu Ile Ser Asn Leu Ala Leu Gly Lys Leu Lys Glu
                 20                  25                  30

Thr Glu Val Glu Glu Asp Pro Glu Leu Val Thr Cys Lys His Gln
                 35                  40                  45

Cys Gln Gln Gln Arg Gln Tyr Thr Glu Ser Asp Lys Arg Thr Cys
                 50                  55                  60

Leu Gln Gln Cys Asp Ser Met Lys Gln Glu Arg Glu Lys Gln Val
                 65                  70                  75

Glu Glu Glu Thr Arg Glu Lys Glu Glu Glu His Gln Glu Gln His
                 80                  85                  90

Glu Glu Glu Glu Asp Glu Asn Pro Tyr Val Phe Glu Glu Asp Lys
                 95                 100                 105

Asp Phe Ser Thr Arg Val Glu Thr Glu Gly Gly Ser Ile Arg Val
                110                 115                 120

Leu Lys Lys Phe Thr Glu Lys Ser Lys Leu Leu Gln Gly Ile Glu
                125                 130                 135

Asn Phe Arg Leu Ala Ile Leu Glu Ala Arg Ala His Thr Phe Val
                140                 145                 150

Ser Pro Arg His Phe Asp Ser Glu Val Val Leu Phe Asn Ile Lys
                155                 160                 165

Gly Arg Ala Val Leu Gly Leu Val Arg Glu Ser Glu Thr Glu Lys
                170                 175                 180

Ile Thr Leu Glu Pro Gly Asp Met Ile His Ile Pro Ala Gly Thr
                185                 190                 195

Pro Leu Tyr Ile Val Asn Arg Asp Glu Asn Glu Lys Leu Leu Leu
                200                 205                 210

Ala Met Leu His Ile Pro Val Ser Thr Pro Gly Lys Phe Glu Glu
                215                 220                 225

Phe Phe Gly Pro Gly Gly Arg Asp Pro Glu Ser Val Leu Ser Ala
                230                 235                 240

Phe Ser Trp Asn Val Leu Gln Ala Ala Leu Gln Thr Pro Lys Gly
                245                 250                 255

Lys Leu Glu Arg Leu Phe Asn Gln Gln Asn Glu Gly Ser Ile Phe
                260                 265                 270

Lys Ile Ser Arg Glu Arg Val Arg Ala Leu Ala Pro Thr Lys Lys
                275                 280                 285

Ser Ser Trp Trp Pro Phe Gly Gly Glu Ser Lys Ala Gln Phe Asn
                290                 295                 300

Ile Phe Ser Lys Arg Pro Thr Phe Ser Asn Gly Tyr Gly Arg Leu
                305                 310                 315

Thr Glu Val Gly Pro Asp Asp Glu Lys Ser Trp Leu Gln Arg Leu
                320                 325                 330

Asn Leu Met Leu Thr Phe Thr Asn Ile Thr Gln Arg Ser Met Ser
                335                 340                 345

Thr Ile His Tyr Asn Ser His Ala Thr Lys Ile Ala Leu Val Met
                350                 355                 360

Asp Gly Arg Gly His Leu Gln Ile Ser Cys Pro His Met Ser Ser
                365                 370                 375

Arg Ser Asp Ser Lys His Asp Lys Ser Ser Pro Ser Tyr His Arg
                380                 385                 390

Ile Ser Ala Asp Leu Lys Pro Gly Met Val Phe Val Val Pro Pro
                395                 400                 405

Gly His Pro Phe Val Thr Ile Ala Ser Asn Lys Glu Asn Leu Leu
                410                 415                 420

Ile Ile Cys Phe Glu Val Asn Val Arg Asp Asn Lys Lys Phe Thr
                425                 430                 435

Phe Ala Gly Lys Asp Asn Ile Val Ser Ser Leu Asp Asn Val Ala
                440                 445                 450

Lys Glu Leu Ala Phe Asn Tyr Pro Ser Glu Met Val Asn Gly Val
                455                 460                 465

Ser Glu Arg Lys Glu Ser Leu Phe Phe Pro Phe Glu Leu Pro Ser
                470                 475                 480

Glu Glu Arg Gly Arg Arg Ala Val Ala
                485

(2) INFORMATION FOR SEQ ID NO: 4:
    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 409
        (B) TYPE: amino acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Ala Thr Arg Ala Lys Leu Ser Leu Ala Ile Phe Leu Phe Phe
                  5                  10                  15

Leu Leu Ala Leu Ile Ser Asn Leu Ala Leu Gly Lys Leu Lys Glu
                 20                  25                  30

Thr Glu Val Glu Glu Asp Pro Glu Leu Val Thr Cys Lys His Gln
                 35                  40                  45

Cys Gln Gln Gln Arg Gln Tyr Thr Glu Ser Asp Lys Arg Thr Cys
                 50                  55                  60

Leu Gln Gln Cys Asp Ser Met Lys Gln Glu Arg Glu Lys Gln Val
                 65                  70                  75

Glu Glu Glu Thr Arg Glu Lys Glu Glu Glu His Gln Glu Gln His
                 80                  85                  90

Glu Glu Glu Glu Asp Glu Asn Pro Tyr Val Phe Glu Glu Asp Lys
                 95                 100                 105

Asp Phe Ser Thr Arg Val Glu Thr Glu Gly Gly Ser Ile Arg Val
                110                 115                 120

Leu Lys Lys Phe Thr Glu Lys Ser Lys Leu Leu Gln Gly Ile Glu
                125                 130                 135

Asn Phe Arg Leu Ala Ile Leu Glu Ala Arg Ala His Thr Phe Val
                140                 145                 150

Ser Pro Arg His Phe Asp Ser Glu Val Val Leu Phe Asn Ile Lys
                155                 160                 165

Gly Arg Ala Val Leu Gly Leu Val Arg Glu Ser Glu Thr Glu Lys
                170                 175                 180

Ile Thr Leu Glu Pro Gly Asp Met Ile His Ile Pro Ala Gly Thr
                185                 190                 195

Pro Leu Tyr Ile Val Asn Arg Asp Glu Asn Glu Lys Leu Leu Leu
                200                 205                 210

Ala Met Leu His Ile Pro Val Ser Thr Pro Gly Lys Phe Glu Glu
                215                 220                 225

Phe Phe Gly Pro Gly Gly Arg Asp Pro Glu Ser Val Leu Ser Ala
                230                 235                 240

Phe Ser Trp Asn Val Leu Gln Ala Ala Leu Gln Thr Pro Lys Gly
                245                 250                 255

Lys Leu Glu Arg Leu Phe Asn Gln Gln Asn Glu Gly Ser Ile Phe
                260                 265                 270

Lys Ile Ser Arg Glu Arg Val Arg Ala Leu Ala Pro Thr Lys Lys
                275                 280                 285

Ser Ser Trp Trp Pro Phe Gly Gly Glu Ser Lys Ala Gln Phe Asn
                290                 295                 300

Ile Phe Ser Lys Arg Pro Thr Phe Ser Asn Gly Tyr Gly Arg Leu
                305                 310                 315

Thr Glu Val Gly Pro Asp Asp Glu Lys Ser Trp Leu Gln Arg Leu
                320                 325                 330

Asn Leu Met Leu Thr Phe Thr Asn Ile Thr Gln Arg Ser Met Ser
                335                 340                 345

Thr Ile His Tyr Asn Ser His Ala Thr Lys Ile Ala Leu Val Met
                350                 355                 360

Asp Gly Arg Gly His Leu Gln Ile Ser Cys Pro His Met Ser Ser
                365                 370                 375

Arg Ser Asp Ser Lys His Asp Lys Ser Ser Pro Ser Tyr His Arg
                380                 385                 390

Ile Ser Ala Asp Leu Lys Pro Gly Met Val Phe Val Val Pro Pro
                395                 400                 405

Gly His Pro Phe

(2) INFORMATION FOR SEQ ID NO: 5:
    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 1924
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: double
        (D) TOPOLOGY: linear
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

TGTAAAACGA CGGCCAGTGA ATTGTAATAC GACTCACTAT AGGGCGAATT    50
GGGTACCGGG CCCCCCCTCG AGGTCGACGG TATCGATAAG CTTGATTTTG   100
TTCCTCACTG ACCTCACC ATG GCG ACC AGA GCC AAG CTT TCT TTA   145
                    Met Ala Thr Arg Ala Lys Leu Ser Leu
                                      5

GCT ATC TTC CTT TTC TTT CTT TTA GCC TTG ATT TCA AAC CTA GCC   190
Ala Ile Phe Leu Phe Phe Leu Leu Ala Leu Ile Ser Asn Leu Ala
 10                  15                  20

TTG GGC AAA CTT AAA GAA ACC GAG GTC GAA GAA GAT CCC GAG CTC   235
Leu Gly Lys Leu Lys Glu Thr Glu Val Glu Glu Asp Pro Glu Leu
 25                  30                  35

GTA ACA TGC AAA CAC CAG TGC CAA CAG CAA CGG CAA TAC ACT GAG   280
Val Thr Cys Lys His Gln Cys Gln Gln Gln Arg Gln Tyr Thr Glu
 40                  45                  50

AGT GAC AAG CGA ACA TGC TTG CAA CAA TGT GAC AGT ATG AAG CAA   325
Ser Asp Lys Arg Thr Cys Leu Gln Gln Cys Asp Ser Met Lys Gln
 55                  60                  65

GAG CGA GAG AAA CAA GTC GAA GAG GAA ACT CGC GAG AAG GAA GAA   370
Glu Arg Glu Lys Gln Val Glu Glu Glu Thr Arg Glu Lys Glu Glu
 70                  75                  80

GAA CAT CAA GAG CAG CAT GAG GAG GAG GAA GAC GAA AAT CCC TAC   415
Glu His Gln Glu Gln His Glu Glu Glu Glu Asp Glu Asn Pro Tyr
 85                  90                  95

GTT TTT GAA GAA GAT AAG GAT TTT TCG ACC AGA GTC GAA ACA GAA   460
Val Phe Glu Glu Asp Lys Asp Phe Ser Thr Arg Val Glu Thr Glu
100                 105                 110

GGT GGC AGC ATT CGG GTT CTC AAG AAG TTC ACT GAG AAA TCC AAG   505
Gly Gly Ser Ile Arg Val Leu Lys Lys Phe Thr Glu Lys Ser Lys
115                 120                 125

CTT CTT CAA GGC ATT GAG AAT TTC CGT TTG GCC ATC TTA GAA GCT   550
Leu Leu Gln Gly Ile Glu Asn Phe Arg Leu Ala Ile Leu Glu Ala
130                 135                 140

AGA GCA CAC ACG TTC GTG TCC CCA CGC CAC TTT GAT TCC GAG GTT   595
Arg Ala His Thr Phe Val Ser Pro Arg His Phe Asp Ser Glu Val
145                 150                 155

GTC TTG TTC AAC ATT AAG GGG AGA GCC GTA CTT GGG TTG GTG AGG   640
Val Leu Phe Asn Ile Lys Gly Arg Ala Val Leu Gly Leu Val Arg
160                 165                 170

GAA AGT GAA ACA GAA AAA ATC ACC CTA GAA CCT GGA GAC ATG ATA   685
Glu Ser Glu Thr Glu Lys Ile Thr Leu Glu Pro Gly Asp Met Ile
175                 180                 185

CAC ATA CCA GCA GGC ACA CCA CTG TAC ATC GTT AAC AGA GAT GAG   730
His Ile Pro Ala Gly Thr Pro Leu Tyr Ile Val Asn Arg Asp Glu
190                 195                 200

AAT GAG AAG CTC CTC CTT GCC ATG CTC CAT ATA CCT GTC TCT ACT   775
Asn Glu Lys Leu Leu Leu Ala Met Leu His Ile Pro Val Ser Thr
205                 210                 215

CCT GGA AAA TTT GAG GAA TTT TTC GGG CCT GGA GGA CGA GAC CCA   820
Pro Gly Lys Phe Glu Glu Phe Phe Gly Pro Gly Gly Arg Asp Pro
220                 225                 230

GAA TCG GTC CTC TCA GCA TTC AGC TGG AAT GTG CTG CAA GCT GCG   865
Glu Ser Val Leu Ser Ala Phe Ser Trp Asn Val Leu Gln Ala Ala
235                 240                 245

CTC CAA ACC CCA AAA GGA AAG TTA GAA AGG CTT TTT AAT CAA CAG   910
Leu Gln Thr Pro Lys Gly Lys Leu Glu Arg Leu Phe Asn Gln Gln
250                 255                 260

AAC GAG GGA AGT ATT TTC AAA ATA AGC AGA GAA CGG GTG CGT GCG   955
Asn Glu Gly Ser Ile Phe Lys Ile Ser Arg Glu Arg Val Arg Ala
265                 270                 275

TTG GCC CCC ACC AAG AAA AGC TCT TGG TGG CCA TTC GGC GGC GAA  1000
Leu Ala Pro Thr Lys Lys Ser Ser Trp Trp Pro Phe Gly Gly Glu
280                 285                 290

TCC AAG GCT CAA TTC AAT ATT TTC AGC AAG CGT CCC ACT TTC TCC  1045
Ser Lys Ala Gln Phe Asn Ile Phe Ser Lys Arg Pro Thr Phe Ser
295                 300                 305

AAC GGA TAT GGC CGT TTA ACT GAA GTT GGT CCT GAT GAT GAA AAG  1090
Asn Gly Tyr Gly Arg Leu Thr Glu Val Gly Pro Asp Asp Glu Lys
310                 315                 320

AGT TGG CTT CAA AGA CTC AAC CTC ATG CTT ACC TTT ACC AAC ATC  1135
Ser Trp Leu Gln Arg Leu Asn Leu Met Leu Thr Phe Thr Asn Ile
325                 330                 335

ACC CAG AGA TCT ATG AGT ACT ATT CAC TAC AAC TCA CAT GCA ACG  1180
Thr Gln Arg Ser Met Ser Thr Ile His Tyr Asn Ser His Ala Thr
340                 345                 350

AAG ATA GCA CTG GTG ATG GAT GGT AGA GGG CAT CTT CAA ATA TCA  1225
Lys Ile Ala Leu Val Met Asp Gly Arg Gly His Leu Gln Ile Ser
355                 360                 365

TGT CCA CAC ATG TCA TCA AGG TCA GAC TCA AAG CAT GAT AAG AGT  1270
Cys Pro His Met Ser Ser Arg Ser Asp Ser Lys His Asp Lys Ser
370                 375                 380

AGC CCC TCA TAC CAT AGA ATC AGT GCG GAC TTG AAG CCT GGA ATG  1315
Ser Pro Ser Tyr His Arg Ile Ser Ala Asp Leu Lys Pro Gly Met
385                 390                 395

GTG TTT GTT GTC CCT CCT GGT CAT CCC TTC GTC ACT ATA GCT TCC  1360
Val Phe Val Val Pro Pro Gly His Pro Phe Val Thr Ile Ala Ser
400                 405                 410

AAT AAA GAG AAT CTC CTC ATA ATT TGC TTC GAG GTT AAC GTT CGA  1405
Asn Lys Glu Asn Leu Leu Ile Ile Cys Phe Glu Val Asn Val Arg
415                 420                 425

GAC AAC AAG AAG TTT ACG TTT GCA GGG AAG GAC AAC ATT GTG AGC  1450
Asp Asn Lys Lys Phe Thr Phe Ala Gly Lys Asp Asn Ile Val Ser
430                 435                 440

TCT CTG GAC AAC GTA GCT AAG GAG CTG GCC TTT AAC TAT CCT TCT  1495
Ser Leu Asp Asn Val Ala Lys Glu Leu Ala Phe Asn Tyr Pro Ser
445                 450                 455

GAG ATG GTG AAC GGA GTC TCC GAA AGA AAG GAG AGT CTC TTT TTC  1540
Glu Met Val Asn Gly Val Ser Glu Arg Lys Glu Ser Leu Phe Phe
460                 465                 470

CCC TTC GAG TTG CCG AGC GAG GAG CGT GGT CGT CGC GCT GTT GCG  1585
Pro Phe Glu Leu Pro Ser Glu Glu Arg Gly Arg Arg Ala Val Ala
475                 480                 485

TGA GAAGCAGTGT GGAGGTGGCT GATAACGGGG AATGTATTTA GCTTTGAGAG  1638
TCTTTAAATT TTCTGTATTT GTTGTAATGT TAGTAGTTCC TTAAATTGGC  1688
CAGATGGAGT TTATGTGTTT GTAAATGCAG GGATGCTAAC GGAATAAAAT  1738
GGCCACTTGT ATTGCTAAAG AAAAAAACCA GCCCGGGCCG TCGACCACGC  1788
GTGCCCTATA GTGAGTCGTA TTACAATCGA ATTCCTGCAG CCCGGGGGAT  1838
CCACTAGTTC TAGAGCGGCC GCCACCGCGG TGGAGCTCCA GCTTTTGTTC  1888
CCTTTAGTGA GGGTTAATTT CGAGCTTGGC GTAATC  1924

(2) INFORMATION FOR SEQ ID NO: 6:
    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 3718
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: double
        (D) TOPOLOGY: linear
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

TTGTAAACGA CGGCCAGTGA ATTGTAATAC GACTCACTAT AGGGCGAATT    50
GGGTACCGGG CCCCCCCTCG AGGTCGACGG TATCGATAAG CTTGATTGTA   100
ATACGACTCA CTATAGGGCA CGCGTGGTCG ACGGCCCGGG CTGGTCTGAG   150
AAACTCATTA GGCACTGGAA AATTCTCAAA GGAAATAATG TGAGTCAGCC   200
AATTCAAACC CACCATATCT TTATTAATTT CACTTTTTTC TTTATTTTAT   250
AATTTTTAGT CTCACAGTCA CACATTTTAA CAGGTTATGA TAACAAGGGG   300
CAAAGATAAG GGTGAGACCG GGATTATAAA GCGTGTCATT CGCTCTCAAA   350
ATCGTGTCAT TGTAGAGAGT AAAAACCTGG TGAGAGATAT TATCATCACA   400
ATTTGGTCCT TCTGTTTTTC TAATGCCCTA TCTTCCTTAG ATTATGTTTT   450
CAATTCCACT GTCAATGTGT CTTGCATCAG AATATTAATC AATTGTGACA   500
TTGAGCATGT GATTGTGTAA ATTTTCCTGA TAGGTTTCTC ACTCCAATGC   550
CTTTTGTCAT CCTCTTTATA GGTAAAGAAG CATATAAAGC AAGGGCAAGG   600
TCATGAAGGG GGCATCTTTA CAGTGGAAGC CCCACTGCAT GCCTCCAATG   650
TGCAAGTTCT TGACCCAGTG ACAGGGTATG TCATTGTTCA GATATTGAAC   700
TGGTGATTGC ATCTCCAAAC GGGATAACAT CATTAACATG TATGAAAGTA   750
AGAGTTACCA ACTTTTACTT GTGCAGCAAG CCTTGCAAGG TTGGAGTTAA   800
ATATCTTGAA GATGGTACTA AAGTCAGAGT GTCCAGAGGA ATAGGAACCT   850
CAGGGTCCAT AGTCCCTCGT CCTGAGATTT TAAAGATAAG AACTACCCCA   900
AGACCTGCAG TCCGTAAGTA TCTAACAAGC TTAATTATGC TTTTTCATGT   950
ATGAGTTGTT GACAAAACAT GGCCAGAGCC AATAGAGAAT CGAGAAAAAG  1000
TGAGACGGAA AATGAACTTG AATTATGAGA AAGGTGTGTG AAACAAACAA  1050
GCCAATAATG TGGCTTATAT AATATATAAT ATATAGATAT AGACCAGAGT  1100
GAGTAACGAA TCACTAACTA ATTACATGTG TATATCTACC TAATTAGATG  1150
ACTCATCAAA CAAAGCGAAC TATTGTGATA GAGACTTTAT TTTTCGCAAT  1200
TAATTCAAAG ATGTACTGCT TATCTTCTTT GCTACATGTC TGTTGACATG  1250
CATTGTTATC CATAACCTTG TTATTATACT TGGTGTTGAG AAAGGAGAGT  1300
CTCCTTGCAC TTTAGAGACA TTCTTTAAAC TGACTTGACC TTATTGAAAA  1350
TTCGAGATAG CAACTTAGCA CCACACCTTA AAAAGAAAGA TTTTTTAGAG  1400
GGTAGATTAA TTGTTGAATA ATGTTAATCA TCAAAGGTTT AAGATTTATT  1450
AAGTGCTTTC CATTGTCTTA AAAATCTTGC TTCTAGGACT AGGATGTGTA  1500
TTGTTACATG ATTTCCCCCC CTTGGTATCA ACTAAAGCAT GTTGGACTTG  1550
CGCTCCATAT GCAGAAACTC AAATTAAAAA CATCATTTGT AATGTATAGT  1600
AAGTGTATAT ATAACATTGT AAGTTGTCGA TCAAAGTTAT TTGGATTAAT  1650
GGATTTAAGT CTTCTATAAT ATTCCATTGA GAGCCAGAAG CCAGGTCCAA  1700
AGGAATAAGT AACTCGCATG AATTCATTCT CTTGCTTCTA TACAGCTATT  1750
TTTCCATCTT AGTGTTGCGG GAAACTACTT CAGTTCTCGC AGATGTGCAA  1800
AACTTGTAGG GATCCATGTA GTTCAGTGAA ACCCATGCTT TCTTAATTGA  1850
CAGAGATACA TTAAAACTTT TTACAGAATT GAGAAACCCA AGCCTTGTTA  1900
ATTCTCAAAG ATACATTTAA ACTTTTTTCA GAAACGTGCT GAGTATTTTA  1950
TCCTGTTTGT TATTCATTTT TGGCAGTTGG TCCTAAAAAT ACTCCTATGA  2000
ATCTTGTGCT AGAGAAGACT TGCAATGCTA AAACAGGACG GGGCATGCCT  2050
GAACTTTAAG GAGACGTTGC CTTGTTCCGA TTAGGTAATT GCTATCGTGA  2100
TGAACAAAAA TTTGGTGTGA ATTTATCCCC TTGCCCTTTG CCATGATTCA  2150
ATTAAAGACG TGTTTGGAAC CACATTCTAA CACCACTTTA TGATGGGTTA  2200
GACGCAAAAT CTAGATTGGG TAGTGTTTAC ACACAGTTAC AAACACATTC  2250
CTTGTTTAAT GTTATCATGC CTAGGAGTTG AATAACTTGT AACTTTACCA  2300
ATTAGACATT ACTACTAGCA TTCTTTTTCC TATTCAAGTT GATGTTATCT  2350
CCAGTTAGTG ATGGTCATTT CATTCCATAA ACTTCAATTG TTAAAATGAG  2400
TGAAAAGGGA AAAAGGAACC CGTTTGATTG TTATGGTTCT AGTGATTTTT  2450
ATTAATTGGG TTTGTCCATT AGTGTCGATT TGAGCTAAAT AGTTTCCCCC  2500
CCCCAAAAGA TCAGTCTTCT CACATGTCAT ATTCATGCGC TGGTACCCTT  2550
TTCATCCAGT TCCAACAAAC TTGCTGTACG AAGTCAGGTT GCATGAAAAT  2600
AGTCAAATTT TCTTTAAGGG GGATATTATA CGTAAATAAA TAACGTAACC  2650
CAAAAGTCTT ACTTGTTGGG TAACGTGGGT TTTGGTGTTT GATGGACCTA  2700
GAACACTGTT TGTTGCTCTT ATATGCTTAC AAAGTAAAAA TGGTTATCAC  2750
ATTTGGGGAA AAAATGTAGG CCCACTTATG ATATTTCGAC CTAAATGCAA  2800
AATGGTTTAT CAATTTTTTT ATACTTAGTA TGATAAAACT CCTTTTTTTT  2850
TTCCACTGGC ATACTATTTC TCTAAGACTT TTTAATAGTT CCGATAATTC  2900
TTAGCTTAAA GAAATACGAC AAGGTTAGGA ATATTTTTTT ATTATGTGAC  2950
ATTATTTTTT AAATATTTTG CTTCATATGA ATTTATACAA TCATTATAAT  3000
TTGACCTTTT AAATGACTTT TAAAAATGAT CAGACCTAAA ATTTGAGTCT  3050
TCTGATTGAG ATGCAAACTT ATTTCTTTTT ATATTTTATA TTTTATACTC  3100
ATTTGTTTCT CTTTCTATTA TATTTCTTTT TTTTCTTCTC TTTATGCAAA  3150
AACGTATGAC GTTGATTGGT GTCTTTGGCA ATCTTTTTAT GACGCTCAAA  3200
AGTGAAAATA AATATTGTTC ACTTTCACCT CACGCTGGCC TTCCGCTGAT  3250
GGTGGTTGTA CGCACTTATT TGATTTTTTT TTCTTCCACA TTTAATGAGG  3300
TGAATCAGTT AGAGAAATAT TAAAAAAAAT AAATAAATAA AGGAAGACGA  3350
CTAATACAAT AAAGAATACG AAACTCACAA TGAATAGACC CAATTAGAAC  3400
CATTTATTTT CCTTACAAAT TAAAGAAAAC GTTTTTTTAA CAATATATCA  3450
CATTATCATC TATTATATTT TTATTTATAT TTTTTATAAC TTTCTCTATC  3500
TAGGTGTAGA TTGACATGAG TATACGCACG CACACCCAGC TCTACTTAGC  3550
AGCAATTACC CGTTTTACTT GCTACTTAAG AGACACGTAC ATTAACACTT  3600
GTCCTTGTGC ATGCAATTGC CACCACATTC CTCACTCCAC CCTTTTCTTT  3650
ATATATAAAC AAACACAATG GATCATCTCA AACCAAGAGT GAGTTTGTTG  3700
TTCCTCACTG ACCTCACC  3718

55 

(2) INFORMATION FOR SEQ ID NO: 7: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 4476 
60 (B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQIDNO:7: 

TTGTAAAACG ACGGCCAGTG AATTGTAATA CGACTCACTA TAGGGCGAAT 5 0 
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TGGGTACCGG GCCCCCCCTC GAGGTCGACG GTATCGATAA GCTTGATTAC 100 
TATAGGGCAC GCGTGGTCGA CGGCCCGGGC TGGTACTTTT GACTCCCTAA 150 

TTGACAACTA CTGCATTGTA TCGATATTAA TATGGAATTT GGAATCATGG 2 00 
TCCATGCTTC ATGTATTGTG TACCTCATAT TCAACAGCTA GTGAACACAA 250 
5 AATCTTACAT ACTT TTGTAT TTCTATCAGT TTATACCTTC CCAAATAAAT 3 00 

GGCTTATATT GCATTGAGTT ACATATTATT GTTTAGTTGG ATTGTAATTT 3 50 

ACGAGTAGTT TGTCACGACT GAAGAAATTA ATAAGGTATA AGACACGTCC 4 00 

TGCTCCCGCG AAATTCATTT TCTGTTTATT CTCTGTCTCT GTCTCTATTC 4 50 

AATT CAACCT TC CAT TTGTT TTCGCCAGCA TCCAGATTTG TGCTTTCTCT 500 

10 ATCATTTCAT TTAATTAATG TGATGTATGT ATGGCTGAAT AAAAGATGGA 550 

TTCCTCTTTT TTGTGGGGTG GAAG CTTAAT CTATGGGGCT AGATAAAAAA 6 00 

ATTCATCTGT TTGTTGCACA GAATAAAATA TAAATTAATA ATTAATTAAA 650 
CTTCAAACAT GGACAGGGCA CCTCCAAGTT ATTTTAAAAC CGACCATGGC 700 
CATTTTTGCT TTCTGTTGGT GTTCTTGGCT CAGCTTTTGT AATTTTAGAC 750 

15 TGCAGAAACA TCCTGTATGG GTTGGAAAGC AG CTG AG AAA CTCATTAGGC 8 00 

GCTGGAAAAT TCTAAGAGGG GATAATGTAT GTGAGTCAAT TCAAACCCAC 8 50 

CATATGTTTG TCTCTGTGCT CTTTATTAAT TTCACTTTTT TATTTTATAA 9 00 

TTTTAGTCTC ACAGT CACAG AGTCACTTAT GTATT CAT CT AACAGGTTAT 9 50 

GATAACAAGG GGCAAAGATA AGGGTGAGAC CGGGATTATA AAGCGTGTCA 10 00 

20 TTTGCTCTCA AAATCGTGTC ATTGTAGAGG G TAAAAAT CT GGTGAGATAT 10 50 

TATAATCACT ATTTGGTCCT TCTGTTTTTC TAATGCCCTA TCTTCTGTAG 1100 

CTTTTGTTTT CAATTCCACT GTCAGTGTGT CTTGCATCAG AATATTAATC 1150 

GGTTGTCAGT GACATTGAGC ATTTAATTGT GTAAATTTTC CTGTTAGATT 12 00 

TCTCACTCCA ATGCCTTTTG CCGTCCTCTT TATAGG TAAA GAAG CATAT C 1250 

25 AAGCAAGGGC AAGGTCATAA AGGGGGAATC TTTACAGTGG AAGCCCCACT 13 00 

GCATGCCTCC AATGTG CAAG TTCTTGACCC AGTGACAGGG TATGTACATG 13 50 

TTAGATATTG AACTGGTGAT TGCTTCTCCA AATGGGATAA CATGTATGTA 14 00 

AGTAAGAGTA ACCTACTTTT ACTTGTGCAG CAAGCCTTGC AAGGTTAGAG 14 50 

TTAAAATATC TTGAAGATGG TACTAAAGTC AGAGTGTCCA GAGGAATAGG 1500 

30 AG CATCAGGG TTCATAGTCC CTCGTCCCAA GAT CTTAAAG ATAAGAACTA 15 50 

CCCCAAGACC TACAGTCCGT AAGTATCTAA CAAGCTTATG TTTTTTCCTT 1600 

GTATGAGTTG TTGATAAAAC ATGGCCAGAG CCAATAGAGA ATTGAGAAAA 1650 

GGTGAGAAAC AGAAAATGAA CTTGAATTAT GAGAAAGGTG TGGGAAACAA 1700 

ACAAGCCAAT AATGTGGCTT ATATAATATA TAGATATAGA C TAGAGTGAG 1750 

35 TAACGAATCA CTAACTAATT ACATGTG CAT ATCTACCTAA TTAGATGATT 1800 

CGTCAAACGA AG CAAAGTAT TGTGATAGAT AGTTGATTTT T CT CAAATAA 18 50 

TTCTAAGATG TAATACTTAT ATTCTTTGCT ACATGT CTG T TGACATACAT 1900 

TGTTATCCAT AACCTTGTTA TTATACTTGG TGTTAAAAAA GGAGAGTCTC 19 50 

CTTGCACTTT AG AG ACAT T C TTTAAACTGA CTTGACCTTA TTGAAATACA 20 00 

40 TAATTCTAGT TACCAACTTA GCACCACACC ATAAAAGGAA AGATTTTTAA 20 50 

ACGGTAGATT GATTGTTGAA TAATGTTAAT CATCAAAGGT TTAAGATTTA 2100 

TTAAGTGCTT TCCATTGTCT TAAAATATTG CTTCTAGGAC TAGGATGTGT 2150 

ATATTGGTTA CATGATTTCC CCGCCTTCGT ATCAACTTAA GCATGTTGGA 2200

CTTGCACCCA TATGCAGAAA CTCAAATAAA AAACTTCATT TGTAAGGTAT 2250

AATAAGTGTA TATATAACAT TGTAAGTTGT CAATCAGAGT AATTTGGATT 2300

GATGGATATT TAAGTCTTCT ATAATATTTC ATTTAGAGCC AGAAGCCAGG 2350

TTCAAAGGAA TAGGTAATTC ACATGAATTC ATTCTCTTGT TTCTATACAG 2400

TTATTATTTT TTCCATCTTA GTGTTGCAGG AAACTACCTC AGTTGTTGTA 2450

GATGTGCAAA ACTTGTATGG ATATATATAC TGTTCAGTGT TGGGAAACCC 2500 

ATGCTTTCTT AATTCACAGA GATACATTTA AACTTTTTTT AGAAACTTGC 2550

TTAGTATCTT ATCCTGTTAT TCATTTTTGG CAGTTGGTCC TAAAGATACT 2600

CCTATGAATC TTGTGCTAGA GAAGACTTAC GATGCTAAAA CAGGACGGGG 2650

CATGCCTGAA CTTTAAGGAG ACGTTGCCCT GTTCCACTTC CAATTAGGTA 2700 

ACTGCTATCG TGATGAACAA AAATTTGGTG TGAGTTTATC ACCTTGTCCT 2750

TTGCCATGAT TCAATTAAAA GCGTGTTTGG ACTTTGGAAC CTCATTCTAA 2800

CACCACCCTA TGATGGGTTA GACGCAAAAT CTAGACTGGG TAGTGTTTAA 2850 

CGTGTATCTG TGTGAACACA GTTACAAACG CATTCCATGT TTAATGCTAC 2900 

CATGCCTAGG AGTTGAATCA TTTGTAACTT TACCAATTTA GTCATTACTA 2950

CTAGCATTCT TTTCCCTATT CAAGTTGATG TTAGCTCCAG TTAGGGATGG 3000

TCATTTCACT CCATAAACTT TAATTGTTAG GTGAGTGGAA GAGGAACCCG 3050

TTTGATTGTT ATGGTTCTAG TTCTAGTGAT TTTTATTAAT TGGGTTCGAC 3100 

CATATTAGTG TTTGATTTGA GCTATAGATA GTTTTTTCCC CAAAAGATCA 3150 

GTTTTCTCAC ATGTCAGATT CATGGGTTGG TACTCTTTTC ATCCAGTTCC 3200 

AACAAACTTG CTGTTCGAAC TACGAAGTCA GTCTTACTTA TTGGGTAACA 3250 

TGTGGGTTTT GGTGTTTAAT GGATCTAGAA TACTGTTTGT AGCTAAACCT 3300

ATCTTATCAT ATAGGGCCTA AAAAGTAAAA TTGGTTATTA CATTTGGAAA 3350

AAAAGAAATA ATCTAGGCCC ACTGGCACAC TGAAAAACGT TTTCAATGAA 3400

TAATTTAATA GTTTTTTTTT TATAAAAAAA TTTTAATAAA AAATAATGGA 3450

GTTTTTAAAA ATATTACAAC AATCTGTTTC TCTAAGGTTT TTTAATAGTT 3500

CAGATAATTC ATAGCTTAGA GCAATACGAC ATGGTTAGGA AGCATAAAAA 3550

AAATATACGA CATGGTTAGG AATTTTTTTT TAGTATGTCT GACATAATTT 3600

TTTAAATGTT TTGGCTTCAT ATGAATTTAA CAGTGCGTCA TATGAACTTA 3650

CACACTCATT ATATTTTTTA ACCTTTTAAA TGATTTTTAA AAAATATGAC 3650

AGATGCAATC TTATTCTCAC TTTTTATACT TTCACTACTG CTTCATATGA 3700

CCTAAAGTCA GAGAAATATT TTAAAAAGAT AAATACGATA AAGAATACGA 3750

TGAGAAAGAA ACCTCACACA ATGAATAGAC CAAATTAGAC CTATTTATTT 3800

TCCTTAGAAA TAAAGAAAAT AATTATTTTT TATTTTTTCA CATTACATTT 3850

ATATTTTTCT ATCACTTTCT CTATTTAGGT ATTGATTGAC ATATGAGTGT 3900

ACATGAACTT TTTTTAAAAA AAAAGCGTAA ATATTAATTA TATTCATGCA 3950

TTTGTTTTCT GTCTTTCATT TTCTATTTAA TCTTACGTTA TCAATAATCT 4000

ATTATTAAAT TTTATAGTTG ATGATGAATA TATAAGAGAT ATAAATAAAA 4050

AAATAATTAA TTTTATAATA AAAATTAAAA AATAATTAAT TATTTTGAGA 4100 

TAAATTTTTT TTAAGAGAAC AATTATAAAC GGAGAGTATT ATATTTAGTT 4150 

TTATGTGTAC CGGGTACGTG TCTACTAACA TGGTGTCTCT CCATCATTTT 4200

CGTAGGAAAA AACATTATAG GAGTATGAAA AAAGCAAAAG TTTTGTCTGT 4250

TTATGGTTTT GTATATACCC AGCTCTACTT GGCAGCAATT ACCCGTCTTG 4300

CTTGCTACTT ACGAGACACG TACATTAACA CTTGTCCTAG CTAGTGCATG 4350

CAATTGCCAC CCCATTCCTC ACTCCTCCCT TTTCCTTCTC TTTATATTTA 4400

TATATATAAA TAAACAAACA CAATGCATCA TCTCAAAGAA ATTAAGAGAG 4450

TTTTTTTGTT CCTCACTGAC CAAGCC 4476

(2) INFORMATION FOR SEQ ID NO: 8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
TGTAAAACGA CGGCCAGTGA ATT 23

(2) INFORMATION FOR SEQ ID NO: 9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
GATTACGCCA AGCTCGAAAT TAA 23



(2) INFORMATION FOR SEQ ID NO: 10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
ATGGCGACCA GAGCCAAGCT TTCTTTA 27



(2) INFORMATION FOR SEQ ID NO: 11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
CGCAACAGCG CGACGACCAC GCTCGCT 27

(2) INFORMATION FOR SEQ ID NO: 12:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
ATGGCGACCA GAGCCAAGCT TTCTTTA 27

(2) INFORMATION FOR SEQ ID NO: 13:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
GAAGGGATGA CCAGGAGGGA CAACAAA 27

(2) INFORMATION FOR SEQ ID NO: 14:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
TTGTAAACGA CGGCCAGTGA ATT 23

(2) INFORMATION FOR SEQ ID NO: 15:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
GGTGAGGTCA GTGAGGAACA ACA 23

CLAIMS

1. A modified plant sucrose binding protein wherein the modified sucrose
binding protein has a modified amino acid sequence compared to a corresponding
wild-type sucrose binding protein, and wherein expression of the modified sucrose
binding protein in a yeast assay system confers enhanced sucrose uptake compared to
the corresponding wild-type sucrose binding protein.

2. A modified plant sucrose binding protein according to claim 1 wherein 
the modified sucrose binding protein enhances sucrose uptake in the yeast assay 
system by at least 10% compared to the wild-type sucrose binding protein.

3. A modified plant sucrose binding protein according to claim 1 wherein 
the modified sucrose binding protein enhances sucrose uptake in the yeast assay 
system by at least 25% compared to the wild-type sucrose binding protein. 

4. A modified plant sucrose binding protein according to claim 1 wherein 
the modified amino acid sequence comprises a C-terminal truncation compared to
the wild-type sucrose binding protein. 

5. A modified plant sucrose binding protein according to claim 4 wherein 
the C-terminal truncation results in removal of between 10 and 100 amino acids. 

6. A modified plant sucrose binding protein, wherein the corresponding
wild-type sucrose binding protein is selected from the group consisting of SBP1 and
SBP2. 

7. A modified plant sucrose binding protein according to claim 6 wherein 
the protein has an amino acid sequence selected from the group consisting of Seq. 
I.D. Nos. 2 and 4. 

8. A nucleic acid molecule encoding a modified plant sucrose binding
protein according to claim 1.

9. A vector comprising a nucleic acid molecule according to claim 8.



10. A transgenic plant expressing a modified plant sucrose binding protein
according to claim 1.

11. A nucleic acid molecule encoding a modified sucrose binding protein 
according to claim 6. 

12. A transgenic plant expressing a modified plant sucrose binding protein
according to claim 6.

13. An isolated nucleic acid molecule encoding a plant sucrose binding 
protein, wherein the protein comprises an amino acid sequence selected from the 
group consisting of: 

(a) the amino acid sequence set forth in Seq. I.D. No. 3;

(b) the amino acid sequence set forth in Seq. I.D. No. 4;

(c) amino acid sequences having at least 70% sequence identity with the
amino acid sequence of (a) or (b); and

(d) amino acid sequences having at least 90% sequence identity with the
amino acid sequence of (a) or (b).

14. A recombinant expression cassette comprising a promoter sequence 
operably linked to a nucleic acid molecule according to claim 13. 

15. A transgenic plant comprising a recombinant expression cassette
according to claim 14.

16. A recombinant nucleic acid molecule comprising a promoter sequence
operably linked to a nucleic acid sequence, wherein the promoter sequence
comprises a SBP1 or SBP2 promoter.

17. A recombinant nucleic acid molecule according to claim 16 wherein
the promoter sequence comprises at least 25 consecutive nucleotides of a sequence
selected from the group consisting of:

(a) Seq. I.D. No. 7; and
(b) Seq. I.D. No. 8.

18. A recombinant nucleic acid molecule according to claim 17 wherein 
the nucleic acid sequence encodes a plant sucrose binding protein. 

19. A transgenic plant comprising a recombinant nucleic acid molecule
according to claim 17.

20. A transgenic plant comprising a recombinant nucleic acid molecule 
according to claim 18. 
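
The sequence identity thresholds recited in claim 13 (at least 70% or at least 90%) can be illustrated with a minimal sketch. This section does not define how identity is to be calculated, so the definition used here (identical residues divided by the number of aligned columns in which neither sequence has a gap) is an assumption, as are the function name `percent_identity` and the constant names. The two example strings are the first row of the SBP1/SBP2 alignment of FIG. 1(a), with `.` marking the alignment gap.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap_chars: str = ".-") -> float:
    """Percent identity between two pre-aligned sequences.

    Assumed definition: identical residues / columns where neither
    sequence has a gap. Other definitions (e.g. dividing by the full
    alignment length) give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a in gap_chars or b in gap_chars:
            continue  # skip gapped columns entirely
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# First alignment row of FIG. 1(a), blocks joined; '.' is the gap in SBP2.
SBP1_ROW = "MAMRTKLSLAIFFFFLLALFSNLAFGKCKETEVEEEDPEL"
SBP2_ROW = "MATRAKLSLAIFLFFLLALISNLALGKLKETEV.EEDPEL"

identity = percent_identity(SBP1_ROW, SBP2_ROW)
print(f"{identity:.1f}% identity over the first alignment row")
```

Applied to full-length aligned sequences rather than a single row, the same calculation gives a value that can be compared against the 70% and 90% thresholds. Any practical use would need to state which alignment program and parameters produced the alignment.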
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sbpl MAMRT KLSLA I F F FFLLAL F SNLAFGKCKE TEVEEEDPEL 4 0 

Sbp2 MATRAKLSLA I FLFFLLALI SNLALGKLKE TEV , EEDPEL 3 9 

sbpl VTCKHQCQQQ QQYTEGDKRV CLQSCDRYHR MKQEREKQIQ 8 0 

sbp2 VTCKHQCQQQ RQYTESDKRT CLQQCD...S MKQEREKQVE 7 6 

sbpl EETREKKEEE SREREEEQQE QHEEQDENPY I FEEDKDF ET 120 

sbp2 EETREKE EEHQEQ HEEEEDENPY VFEEDKDFST 109 

sbpl RVETEGGRIR VLKKFTEKSK LLQGIENFRL AILEARAHTF 160 

sbp2 RVETEGGSIR VLKKFTEKSK LLQGIENFRL AILEARAHTF 14 9 

QR P 

■k -k * 

sbpl VSPRHFDSEV VF FNIKGRAV . LGLV S ESETE KITLEPGDMI 200 

sbp2 VSPRHFDSEV VLFNIKGRAV LGLVRESETE KITLEPGDMI 189 

sbpl HIPAGTPLYI VNRDENDKLF LAMLHIP VSV STPGKFEEFF 240 

sbp2 HIPAGTPLYI VNRDENEKLL LAMLHIP . . V STPGKFEEFF 227 

sbpl GPGGRDPESV LSAFSWNVLQ AALQTPKGKL EKLFDQQNEG 280 

sbp2 GPGGRDPESV LSAFSWNVLQ AALQTPKGKL ERLFNQQNEG 2 67 

sbpl SIFAISREQV RALAPTKKSS WWPFGGESKP QFNI FSKRPT 320 

sbp2 SIFKISRERV RALAPTKKSS WWPFGGESKA QFNI FSKRPT 307 

★ -k -k 

sbpl ISNGYGRLTE VGPDDDEKSW LQRLNLMLTF TNITQRSMST 360 

sbp2 FSNGYGRLTE VGP.DDEKSW LQRLNLMLTF TNITQRSMST 346 

G 



FIG. 1(a) 

sbp1 IHYNSHATKI ALVIDGRGHL QISCPHMSSR SSHSKHDKSS 400
sbp2 IHYNSHATKI ALVMDGRGHL QISCPHMSSR SD.SKHDKSS 385
P

sbp1 PSYHRISSDL KPGMVFVVPP GHPFVTIASN KENLLMICFE 440
sbp2 PSYHRISADL KPGMVFVVPP GHPFVTIASN KENLLIICFE 425

sbp1 VNARDNKKFT FAGKDNIVSS LDNVAKELAF NYPSEMVNGV 480
sbp2 VNVRDNKKFT FAGKDNIVSS LDNVAKELAF NYPSEMVNGV 465

sbp1 FLLQRFLERK LIGRLYHLPH KDRKESFFFP FELPREERGR 520
sbp2 SERKESLFFP FELPSEERGR 485

sbp1 RADA* 524
sbp2 RAVA* 489

FIG. 1(b)